(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)



(19) World Intellectual Property Organization
International Bureau 

(43) International Publication Date 
24 July 2003 (24.07.2003) 




PCT 



(10) International Publication Number

WO 03/061336 A1



(51) International Patent Classification: H04R 1/40, 3/00 // 5/027

(21) International Application Number: PCT/US03/00741 

(22) International Filing Date: 10 January 2003 (10.01.2003)

(25) Filing Language: English 

(26) Publication Language: English 

(30) Priority Data: 

60/347,656 11 January 2002 (11.01.2002) US

10/315,502 10 December 2002 (10.12.2002) US

(71) Applicant (for all designated States except US): MH
ACOUSTICS, LLC [US/US]; 26 Blackburn Place, Summit,
NJ 07901 (US).



(72) Inventors; and 

(75) Inventors/Applicants (for US only): ELKO, Gary, W.
[US/US]; 26 Blackburn Place, Summit, NJ 07901 (US).
KUBLI, Robert, A. [US/US]; 2661 Crest Lane, Scotch
Plains, NJ 07076 (US). MEYER, Jens [DE/US]; 46
Franklin Place, Summit, NJ 07901 (US).

(74) Agent: MENDELSOHN, Steve; Mendelsohn & Asso- 
ciates, P.C., Suite 715, 1515 Market Street, Philadelphia, 
PA 19102 (US). 

(81) Designated States (national): AE, AG, AL, AM, AT, AU,
AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU,
CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, GE, GH,
GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC,
LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW,
MX, MZ, NO, NZ, OM, PH, PL, PT, RO, RU, SC, SD, SE,
SG, SK, SL, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ,
VC, VN, YU, ZA, ZM, ZW.

[Continued on next page] 



(54) Title: AUDIO SYSTEM BASED ON AT LEAST SECOND-ORDER EIGENBEAMS 



[Figure: block diagram. Audio sensors s=1 through s=S feed a Decomposer (Eigenbeamformer), whose eigenbeam outputs (m=0; m=1(Re), m=1(Im); m=2(Re), m=2(Im)) pass through a Steering Unit, a Compensation Unit, and a Summation Unit to produce an Auditory Scene.]



(57) Abstract: A microphone array-based audio 
system that supports representations of auditory 
scenes using second-order (or higher) harmonic 
expansions based on the audio signals generated by 
the microphone array. In one embodiment, a plurality 
of audio sensors are mounted on the surface of an 
acoustically rigid sphere. The number and location of 
the audio sensors on the sphere are designed to enable 
the audio signals generated by those sensors to be 
decomposed into a set of eigenbeams having at least 
one eigenbeam of order two (or higher). Beamforming 
(e.g., steering, weighting, and summing) can then
be applied to the resulting eigenbeam outputs to 
generate one or more channels of audio signals that 
can be utilized to accurately render an auditory scene. 
Alternative embodiments include using shapes other 
than spheres, using acoustically soft spheres and/or 
positioning audio sensors in two or more concentric
patterns. 






(84) Designated States (regional): ARIPO patent (GH, GM,
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW),
Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM),
European patent (AT, BE, BG, CH, CY, CZ, DE, DK, EE,
ES, FI, FR, GB, GR, HU, IE, IT, LU, MC, NL, PT, SE, SI,
SK, TR), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN,
GQ, GW, ML, MR, NE, SN, TD, TG).

Published: 

— with international search report 



— before the expiration of the time limit for amending the 
claims and to be republished in the event of receipt of 
amendments 

For two-letter codes and other abbreviations, refer to the "Guidance
Notes on Codes and Abbreviations" appearing at the beginning
of each regular issue of the PCT Gazette.



#10/500938 

DT04 Rec'd PCT/PTO 0 8 JUL 2004 

AUDIO SYSTEM BASED ON AT LEAST SECOND-ORDER EIGENBEAMS 






Cross-Reference to Related Applications 

This application claims the benefit of the filing date of U.S. provisional application no.
60/347,656, filed on 01/11/02 as attorney docket no. 1053.001PROV.

BACKGROUND OF THE INVENTION 

Field of the Invention 

The present invention relates to acoustics, and, in particular, to microphone arrays. 



Description of the Related Art 

A microphone array-based audio system typically comprises two units: an arrangement of (a) two 
or more microphones (i.e., transducers that convert acoustic signals (i.e., sounds) into electrical audio 
signals) and (b) a beamformer that combines the audio signals generated by the microphones to form an 
auditory scene representative of at least a portion of the acoustic sound field. This combination enables 
picking up acoustic signals dependent on their direction of propagation. As such, microphone arrays are 
sometimes also referred to as spatial filters. Their advantage over conventional directional microphones, 
such as shotgun microphones, is their high flexibility due to the degrees of freedom offered by the plurality
of microphones and the processing of the associated beamformer. The directional pattern of a microphone 
array can be varied over a wide range. This enables, for example, steering the look direction, adapting the 
pattern according to the actual acoustic situation, and/or zooming in to or out from an acoustic source. All 
this can be done by controlling the beamformer, which is typically implemented in software, such that no 
mechanical alteration of the microphone array is needed. 

There are several standard microphone array geometries. The most common one is the linear
array. Its advantage is its simplicity with respect to analysis and construction. Other geometries include
planar arrays, random arrays, circular arrays, and spherical arrays. The spherical array has several
advantages over the other geometries. The beampattern can be steered to any direction in
three-dimensional (3-D) space, without changing the shape of the pattern. The spherical array also allows full
3-D control of the beampattern. Notwithstanding these advantages, there is also one major drawback.
Conventional spherical arrays typically require many microphones. As a result, their implementation costs
are relatively high.



SUMMARY OF THE INVENTION 
Certain embodiments of the present invention are directed to microphone array-based audio
systems that are designed to support representations of auditory scenes using second-order (or higher)
harmonic expansions based on the audio signals generated by the microphone array. For example, in one
embodiment, the present invention comprises a plurality of microphones (i.e., audio sensors) mounted on
the surface of an acoustically rigid sphere. The number and location of the audio sensors on the sphere are
designed to enable the audio signals generated by those sensors to be decomposed into a set of eigenbeams
having at least one eigenbeam of order two (or higher). Beamforming (e.g., steering, weighting, and
summing) can then be applied to the resulting eigenbeam outputs to generate one or more channels of
audio signals that can be utilized to accurately render an auditory scene. As used in this specification, a
full set of eigenbeams of order n refers to any set of mutually orthogonal beampatterns that form a basis set
that can be used to represent any beampattern having order n or lower.

According to one embodiment, the present invention is a method for processing audio signals. A
plurality of audio signals are received, where each audio signal has been generated by a different sensor of
a microphone array. The plurality of audio signals are decomposed into a plurality of eigenbeam outputs,
wherein each eigenbeam output corresponds to a different eigenbeam for the microphone array and at least
one of the eigenbeams has an order of two or greater.

According to another embodiment, the present invention is a microphone comprising a plurality of
sensors mounted in an arrangement, wherein the number and positions of sensors in the arrangement
enable representation of a beampattern for the microphone as a series expansion involving at least one
second-order eigenbeam.

According to yet another embodiment, the present invention is a method for generating an auditory
scene. Eigenbeam outputs are received, the eigenbeam outputs having been generated by decomposing a
plurality of audio signals, each audio signal having been generated by a different sensor of a microphone
array, wherein each eigenbeam output corresponds to a different eigenbeam for the microphone array and
at least one of the eigenbeam outputs corresponds to an eigenbeam having an order of two or greater. The
auditory scene is generated based on the eigenbeam outputs and their corresponding eigenbeams.

BRIEF DESCRIPTION OF THE DRAWINGS
Other aspects, features, and advantages of the present invention will become more fully apparent
from the following detailed description, the appended claims, and the accompanying drawings in which
like reference numerals identify similar or identical elements.

Fig. 1 shows a block diagram of an audio system, according to one embodiment of the present
invention;

Fig. 2 shows a schematic diagram of a possible microphone array for the audio system of Fig. 1;

Fig. 3A shows the mode amplitude for a continuous array on the surface of an acoustically rigid
sphere;






Fig. 3B shows the mode amplitude for a continuous array elevated over the surface of an 
acoustically rigid sphere; 

Figs. 4 and 5 show the mode magnitude for velocity sensors oriented radially at $r_s = 1.05a$ and $1.1a$,
respectively;

Fig. 6 shows the mode magnitude for a continuous array centered around an acoustically soft
sphere at distance $r = 1.1a$;

Fig. 7 shows velocity modes on the surface of a soft sphere;

Figs. 8A-D show normalized pressure mode amplitude on the surface of a rigid sphere for
spherical wave incidence for various distances $r_l$ of the sound source;

Fig. 9 identifies the positions of the centers of the faces of a truncated icosahedron in spherical
coordinates, where the angles are specified in degrees;

Fig. 10 shows the 3-D directivity pattern of a third-order hypercardioid pattern at 4 kHz using the
truncated icosahedron array on the surface of a sphere of radius 5 cm;

Fig. 11 shows the white noise gain (WNG) of hypercardioid patterns of different order
implemented with the truncated icosahedron array on a sphere with $a = 5$ cm;

Fig. 12 shows the principle filter shape to generate a hypercardioid pattern with a guaranteed
minimum WNG;

Fig. 13 shows the maximum directivity index (DI) for a sphere with $a = 5$ cm, allowing spherical
harmonics up to order $N$, where the WNG is arbitrary;

Fig. 14 shows the WNG corresponding to maximum DI from Fig. 13 for a sphere with $a = 5$ cm;

Fig. 15 shows the maximum DI with different constraints on the WNG for $N = 3$;

Figs. 16A-B show coefficients $C_n(\omega)$ for maximum DI design with $N = 3$ and WNG > -5;

Fig. 17 provides a generalized representation of audio systems of the present invention; 

Fig. 18 represents the structure of an eigenbeam former, such as the generic decomposer of Fig. 17 
and the second-order decomposer of Fig. 1; 

Fig. 19 represents the structure of steering units, such as the generic steering unit of Fig. 17 and the 
second-order steering unit of Fig. 1; 

Fig. 20A shows the frequency weighting function of the output of the decomposer of Fig. 1, while 
Fig. 20B shows the corresponding frequency response correction that should be applied by the 
compensation unit of Fig. 1 ; 

Fig. 21 shows a graphical representation of Equation (61); 

Figs. 22A and 22B show mode strength for second-order and third-order modes, respectively; 
Fig. 22C graphically represents normalized sensitivity of a circular patch-microphone to a spherical 
mode of order n; 






Figs. 23A-D show principle pressure distribution for real parts of third-order harmonics, from left
to right: $Y_3^0$, $Y_3^1$, $Y_3^2$, and $Y_3^3$ (where the $\vartheta$ direction has to be scaled by $\sin\vartheta$);

Fig. 24 shows a preferred patch microphone layout for a 24-element spherical array;

Fig. 25 illustrates an integrated microphone scheme involving standard electret microphone point 
sensors and patch sensors; 

Fig. 26 illustrates a sampled patch microphone; 

Fig. 26A illustrates a sensor mounted at an elevated position over the surface of a (partially
depicted) sphere; 

Fig. 26B graphically illustrates the directivity due to the natural diffraction of a rigid sphere for a
pressure sensor mounted on the surface of a sphere at 

Fig. 27 shows a block diagram of a portion of the audio system of Fig. 1 according to an 
implementation in which an equalization filter is configured between each microphone and the modal
decomposer;

Fig. 28 shows a block diagram of the calibration method for the n-th microphone equalization filter
$v_n(t)$, according to one embodiment of the present invention; and

Fig. 29 shows a cross-sectional view of the calibration configuration of a calibration probe over an
audio sensor of a spherical microphone array, such as the array of Fig. 2, according to one embodiment of
the present invention.



DETAILED DESCRIPTION
According to certain embodiments of the present invention, a microphone array generates a
plurality of (time-varying) audio signals, one from each audio sensor in the array. The audio signals are
then decomposed (e.g., by a digital signal processor or an analog multiplication network) into a
(time-varying) series expansion involving discretely sampled, (at least) second-order (e.g., spherical) harmonics,
where each term in the series expansion corresponds to the (time-varying) coefficient for a different
three-dimensional eigenbeam. Note that a discrete second-order harmonic expansion involves zero-, first-, and
second-order eigenbeams. The set of eigenbeams forms an orthonormal set such that the inner product
between any two discretely sampled eigenbeams at the microphone locations is ideally zero and the inner
product of any discretely sampled eigenbeam with itself is ideally one. This characteristic is referred to
herein as the discrete orthonormality condition. Note that, in real-world implementations in which
relatively small tolerances are allowed, the discrete orthonormality condition may be said to be satisfied
when (1) the inner product between any two different discretely sampled eigenbeams is zero or at least
close to zero and (2) the inner product of any discretely sampled eigenbeam with itself is one or at least
close to one. The time-varying coefficients corresponding to the different eigenbeams are referred to
herein as eigenbeam outputs, one for each different eigenbeam. Beamforming can then be performed
(either in real-time or subsequently, and either locally or remotely, depending on the application) to create
an auditory scene by selectively applying different weighting factors to the different eigenbeam outputs
and summing together the resulting weighted eigenbeams.
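
The discrete orthonormality condition can be checked numerically for a candidate sensor layout. The following is a minimal sketch under stated assumptions, not the procedure of this specification: it assumes real-valued spherical harmonics up to order two, an equal weight of 4π/S for each of the S sensors, and two illustrative layouts (the vertices of an octahedron and of an icosahedron) that are not taken from the embodiments described here. All function and variable names are hypothetical.

```python
import math


def real_sh_up_to_order2(x, y, z):
    """Real spherical harmonics of orders 0, 1, 2 at the unit vector (x, y, z)."""
    c0 = 0.5 / math.sqrt(math.pi)
    c1 = math.sqrt(3.0 / (4.0 * math.pi))
    c2 = 0.5 * math.sqrt(15.0 / math.pi)
    return [
        c0,                                                      # order 0
        c1 * y, c1 * z, c1 * x,                                  # order 1
        c2 * x * y,                                              # order 2
        c2 * y * z,
        0.25 * math.sqrt(5.0 / math.pi) * (3.0 * z * z - 1.0),
        c2 * x * z,
        0.5 * c2 * (x * x - y * y),
    ]


def orthonormality_error(points, num_beams):
    """Largest deviation of the sampled inner products from the identity matrix."""
    weight = 4.0 * math.pi / len(points)  # assumed equal weight per sensor
    rows = [real_sh_up_to_order2(*p)[:num_beams] for p in points]
    worst = 0.0
    for a in range(num_beams):
        for b in range(num_beams):
            inner = weight * sum(r[a] * r[b] for r in rows)
            target = 1.0 if a == b else 0.0
            worst = max(worst, abs(inner - target))
    return worst


def _unit(v):
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


# Illustrative layouts (assumptions, not layouts from this specification).
OCTAHEDRON = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

GOLDEN = (1.0 + math.sqrt(5.0)) / 2.0
ICOSAHEDRON = []
for a in (-1.0, 1.0):
    for b in (-GOLDEN, GOLDEN):
        ICOSAHEDRON += [_unit((0.0, a, b)), _unit((a, b, 0.0)), _unit((b, 0.0, a))]

if __name__ == "__main__":
    print("octahedron, orders 0-1:", orthonormality_error(OCTAHEDRON, 4))
    print("octahedron, orders 0-2:", orthonormality_error(OCTAHEDRON, 9))
    print("icosahedron, orders 0-2:", orthonormality_error(ICOSAHEDRON, 9))
```

Under these assumptions, the six-point octahedron satisfies the condition for the zero- and first-order eigenbeams but not for the second-order ones, whereas the twelve-point icosahedron satisfies it up to order two.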

In order to make a second-order harmonic expansion practicable, embodiments of the present
invention are based on microphone arrays in which a sufficient number of audio sensors are mounted on
the surface of a suitable structure in a suitable pattern. For example, in one embodiment, a number of
audio sensors are mounted on the surface of an acoustically rigid sphere in a pattern that satisfies or nearly
satisfies the above-mentioned discrete orthonormality condition. (Note that the present invention also
covers embodiments whose sets of beams are mutually orthogonal without requiring all beams to be
normalized.) As used in this specification, a structure is acoustically rigid if its acoustic impedance is
much larger than the characteristic acoustic impedance of the medium surrounding it. The highest
available order of the harmonic expansion is a function of the number and location of the sensors in the
microphone array, the upper frequency limit, and the radius of the sphere.

Fig. 1 shows a block diagram of a second-order audio system 100, according to one embodiment of
the present invention. Audio system 100 comprises a plurality of audio sensors 102 configured to form a
microphone array, a modal decomposer (i.e., eigenbeam former) 104, and a modal beamformer 106. In 
this particular embodiment, modal beamformer 106 comprises steering unit 108, compensation unit 110, 
and summation unit 112, each of which will be discussed in further detail later in this specification in 
conjunction with Figs. 18-20. 

Each audio sensor 102 in system 100 generates a time-varying analog or digital (depending on the
implementation) audio signal corresponding to the sound incident at the location of that sensor. Modal
decomposer 104 decomposes the audio signals generated by the different audio sensors to generate a set of
time-varying eigenbeam outputs, where each eigenbeam output corresponds to a different eigenbeam for
the microphone array. These eigenbeam outputs are then processed by beamformer 106 to generate an
auditory scene. In this specification, the term "auditory scene" is used generically to refer to any desired
output from an audio system, such as system 100 of Fig. 1. The definition of the particular auditory scene
will vary from application to application. For example, the output generated by beamformer 106 may
correspond to one or more output signals, e.g., one for each speaker used to generate the resultant auditory
scene. Moreover, depending on the application, beamformer 106 may simultaneously generate
beampatterns for two or more different auditory scenes, each of which can be independently steered to any
direction in space.

In certain implementations of system 100, audio sensors 102 are mounted on the surface of an
acoustically rigid sphere to form the microphone array. Fig. 2 shows a schematic diagram of a possible
microphone array 200 for audio system 100 of Fig. 1. In particular, microphone array 200 comprises 32
audio sensors 102 of Fig. 1 mounted on the surface of an acoustically rigid sphere 202 in a "truncated
icosahedron" pattern. This pattern is described in further detail later in this specification in conjunction
with Fig. 9. Each audio sensor 102 in microphone array 200 generates an audio signal that is transmitted
to the modal decomposer 104 of Fig. 1 via some suitable (e.g., wired or wireless) connection (not shown in
Fig. 2).

Referring again to Fig. 1, beamformer 106 exploits the geometry of the spherical array of Fig. 2
and relies on the spherical harmonic decomposition of the incoming sound field by decomposer 104 to
construct a desired spatial response. Beamformer 106 can provide continuous steering of the beampattern
in 3-D space by changing a few scalar multipliers, while the filters determining the beampattern itself
remain constant. The shape of the beampattern is invariant with respect to the steering direction. Instead
of using a filter for each audio sensor as in a conventional filter-and-sum beamformer, beamformer 106
needs only one filter per spherical harmonic, which can significantly reduce the computational cost.

Audio system 100 with the spherical array geometry of Fig. 2 enables accurate control over the
beampattern in 3-D space. In addition to pencil-like beams, system 100 can also provide multi-direction
beampatterns or toroidal beampatterns giving uniform directivity in one plane. These properties can be
useful for applications such as general multichannel speech pick-up, video conferencing, or direction of
arrival (DOA) estimation. It can also be used as an analysis tool for room acoustics to measure directional
properties of the sound field.

Audio system 100 offers another advantage: it supports decomposition of the sound field into 
mutually orthogonal components, the eigenbeams (e.g., spherical harmonics) that can be used to reproduce 
the sound field. The eigenbeams are also suitable for wave field synthesis (WFS) methods that enable 
spatially accurate sound reproduction in a fairly large volume, allowing reproduction of the sound field that 
is present around the recording sphere. This allows all kinds of general real-time spatial audio
applications.

Spherical Scatterer 

A plane-wave G from the z-direction can be expressed according to Equation (1) as follows:

$$G(kr, \vartheta, t) = e^{i(\omega t + kr\cos\vartheta)} = \sum_{n=0}^{\infty} (2n+1)\, i^n\, j_n(kr)\, P_n(\cos\vartheta)\, e^{i\omega t}, \tag{1}$$

where:

o in general, in spherical coordinates, $r$ represents the distance from the origin (i.e., the center of the
microphone array), $\varphi$ is the angle in the horizontal (i.e., x-y) plane from the x-axis, and $\vartheta$ is the elevation
angle in the vertical direction from the z-axis;

o here the spherical coordinates $r$ and $\vartheta$ determine the observation point;

o $k$ represents the wavenumber, equal to $\omega/c$, where $c$ is the speed of sound and $\omega$ is the frequency of
the sound in radians/second;

o $t$ is time;

o $i$ is the imaginary constant (i.e., $\sqrt{-1}$);

o $j_n$ stands for the spherical Bessel function of the first kind of order $n$; and

o $P_n$ denotes the Legendre function.

G can be seen as a function that describes the behavior of a plane-wave from the z-direction with unity
magnitude and referenced to the origin. An important characteristic of the spherical Bessel functions $j_n$ is
that they converge towards zero if the order $n$ is larger than the argument $kr$. Therefore, only the series
terms up to approximately $n = \lceil kr \rceil$ have to be taken into account. In the following sections, the sound
pressure around acoustically rigid and soft spheres will be derived.
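
The truncation remark can be explored numerically. The sketch below is illustrative only: it assumes the plane-wave series has the form $\sum_n (2n+1)\, i^n j_n(kr) P_n(\cos\vartheta)$ with the time factor omitted (t = 0), evaluates $j_n$ from its ascending power series (adequate for the small arguments used here), and compares the truncated sum with the closed form $e^{i kr\cos\vartheta}$. The helper names are hypothetical.

```python
import cmath
import math


def spherical_jn(n, x, terms=60):
    """Spherical Bessel function j_n(x) from its ascending power series."""
    term = 1.0
    for k in range(1, n + 1):
        term *= x / (2 * k + 1)          # builds x^n / (2n+1)!!
    total = term
    for k in range(terms):
        term *= -(x * x / 2.0) / ((k + 1) * (2 * n + 2 * k + 3))
        total += term
    return total


def legendre(n, u):
    """Legendre polynomial P_n(u) via the three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, u
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * u * p - k * p_prev) / (k + 1)
    return p


def plane_wave_series(kr, theta, n_max):
    """Truncated series for a unit plane wave from the z-direction (t = 0)."""
    u = math.cos(theta)
    return sum(
        (2 * n + 1) * (1j ** n) * spherical_jn(n, kr) * legendre(n, u)
        for n in range(n_max + 1)
    )


def truncation_error(kr, theta, n_max):
    """Magnitude of the difference between the truncated series and exp(i kr cos(theta))."""
    exact = cmath.exp(1j * kr * math.cos(theta))
    return abs(plane_wave_series(kr, theta, n_max) - exact)


if __name__ == "__main__":
    kr, theta = 2.0, 1.0
    for n_max in (math.ceil(kr), math.ceil(kr) + 2, math.ceil(kr) + 6, 15):
        print(n_max, truncation_error(kr, theta, n_max))
```

Printing the error for increasing `n_max` shows how quickly the terms beyond $n \approx \lceil kr \rceil$ stop contributing; how many extra terms to keep in practice depends on the accuracy required.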

Acoustically Rigid Sphere

From Equation (1), the sound velocity for an impinging plane-wave on the surface of a sphere can
be derived using Euler's Equation. In theory, if the sphere is acoustically rigid, then the sum of the radial
velocities of the incoming and the reflected sound waves on the surface of the sphere is zero. Using this
boundary condition, the reflected sound pressure can be determined, and the resulting sound pressure field
becomes the superposition of the impinging and the reflected sound pressure fields, according to Equation
(2) as follows:

$$G(kr, ka, \vartheta) = \sum_{n=0}^{\infty} (2n+1)\, i^n \left( j_n(kr) - \frac{j_n'(ka)}{h_n^{(2)\prime}(ka)}\, h_n^{(2)}(kr) \right) P_n(\cos\vartheta), \tag{2}$$

where:

o $a$ is the radius of the sphere;

o a prime (') denotes the derivative with respect to the argument; and

o $h_n^{(2)}$ represents the spherical Hankel function of the second kind of order $n$.

In order to find a general expression that gives the sound pressure at a point $[r_s, \vartheta_s, \varphi_s]$ for an impinging
sound wave from direction $[\vartheta, \varphi]$, an addition theorem given by Equation (3) as follows is helpful:

$$P_n(\cos\Theta) = \sum_{m=-n}^{n} \frac{(n-m)!}{(n+m)!}\, P_n^m(\cos\vartheta_s)\, P_n^m(\cos\vartheta)\, e^{im(\varphi - \varphi_s)}, \tag{3}$$

where $\Theta$ is the angle between the impinging sound wave and the radius vector of the observation point.
Substituting Equation (3) into Equation (2) yields the normalized sound pressure around a spherical
scatterer according to Equation (4) as follows:

$$G(\vartheta_s, \varphi_s, kr_s, ka, \vartheta, \varphi) = \sum_{n=0}^{\infty} (2n+1)\, i^n\, b_n(ka, kr_s) \sum_{m=-n}^{n} \frac{(n-m)!}{(n+m)!}\, P_n^m(\cos\vartheta_s)\, P_n^m(\cos\vartheta)\, e^{im(\varphi - \varphi_s)}, \tag{4}$$

where the coefficients $b_n$ are the radial-dependent terms given by Equation (5) as follows:

$$b_n(ka, kr_s) = j_n(kr_s) - \frac{j_n'(ka)}{h_n^{(2)\prime}(ka)}\, h_n^{(2)}(kr_s). \tag{5}$$

To simplify the notation further, spherical harmonics $Y$ are introduced in Equation (4), resulting in Equation
(6) as follows:

$$G(\vartheta_s, \varphi_s, kr_s, ka, \vartheta, \varphi) = 4\pi \sum_{n=0}^{\infty} i^n\, b_n(ka, kr_s) \sum_{m=-n}^{n} Y_n^m(\vartheta, \varphi)\, Y_n^{m*}(\vartheta_s, \varphi_s), \tag{6}$$

where the superscripted asterisk (*) denotes the complex conjugate.

Acoustically Soft Sphere

In theory, for an acoustically soft sphere, the pressure on the surface is zero. Using this boundary
condition, the sound pressure field around a soft spherical scatterer is given by Equation (7) as follows:

$$G(kr, ka, \vartheta) = \sum_{n=0}^{\infty} (2n+1)\, i^n \left( j_n(kr) - \frac{j_n(ka)}{h_n^{(2)}(ka)}\, h_n^{(2)}(kr) \right) P_n(\cos\vartheta). \tag{7}$$

Setting $r$ equal to $a$, one sees that the boundary condition is fulfilled. The more general expressions for the
sound pressure, like Equations (4) or (6), do not change, except for using a different $b_n$ given by Equation
(8) as follows:

$$b_n^{(s)}(ka, kr_s) = j_n(kr_s) - \frac{j_n(ka)}{h_n^{(2)}(ka)}\, h_n^{(2)}(kr_s), \tag{8}$$

where the superscript (s) denotes the soft scatterer case.

Spherical Wave Incidence

The general case of spherical wave incidence is interesting since it will give an understanding of
the operation of a spherical microphone array for nearfield sources. Another goal is to obtain an
understanding of the nearfield-to-farfield transition for the spherical array. Typically, a farfield situation is
assumed in microphone array beamforming. This implies that the sound pressure has planar wave-fronts
and that the sound pressure magnitude is constant over the array aperture. If the array is too close to a
sound source, neither assumption will hold. In particular, the wave-fronts will be curved, and the sound
pressure magnitude will vary over the array aperture, being higher for microphones closer to the sound
source and lower for those further away. This can cause significant errors in the nearfield beampattern (if
the desired pattern is the farfield beampattern).

A spherical wave can be described according to Equation (9) as follows:

$$G(kR, t) = A\, \frac{e^{i(\omega t - kR)}}{R}, \quad R > A, \tag{9}$$

where $R$ is the distance between the source and the microphone, and $A$ can be thought of as the source
dimension. This brings two advantages: (a) $G$ becomes dimensionless and (b) the problem of $R = 0$ does
not occur. With the source location described by the vector $\mathbf{r}_l$, the sensor location described by $\mathbf{r}_s$, and $\Theta$
being the angle between $\mathbf{r}_l$ and $\mathbf{r}_s$, $R$ may be given according to Equation (10) as follows:

$$R = \sqrt{r_l^2 + r_s^2 - 2\, r_l r_s \cos\Theta}. \tag{10}$$
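
To give a feel for the nearfield amplitude variation described above, the following sketch evaluates the source-to-sensor distance (the law-of-cosines form of Equation (10)) for the sensor nearest to and farthest from the source, and converts the 1/R magnitude ratio of a free spherical wave to decibels. It deliberately ignores scattering by the sphere, and the example distances and function names are assumptions made only for illustration.

```python
import math


def source_to_sensor_distance(r_l, r_s, theta):
    """Distance R between a source at radius r_l and a sensor at radius r_s.

    theta is the angle (radians) between the source and sensor position vectors.
    """
    return math.sqrt(r_l ** 2 + r_s ** 2 - 2.0 * r_l * r_s * math.cos(theta))


def level_spread_db(r_l, a):
    """Free-field level difference (dB) between nearest and farthest sensor.

    Assumes sensors on a sphere of radius a and a 1/R magnitude law;
    diffraction by the sphere is not modeled.
    """
    nearest = source_to_sensor_distance(r_l, a, 0.0)
    farthest = source_to_sensor_distance(r_l, a, math.pi)
    return 20.0 * math.log10(farthest / nearest)


if __name__ == "__main__":
    a = 0.05  # sphere radius in meters (illustrative value)
    for r_l in (0.1, 0.5, 5.0):
        print(f"source at {r_l} m: spread = {level_spread_db(r_l, a):.2f} dB")
```

The spread shrinks as the source moves away, which is the free-field part of the nearfield-to-farfield transition; the scattered field adds further direction-dependent effects that this sketch does not capture.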

Equation (9) can be expressed in spherical coordinates according to Equation (11) as follows:

$$G(kr_l, kr_s, \Theta) = -ikA \sum_{n=0}^{\infty} (2n+1)\, j_n(kr_s)\, h_n^{(2)}(kr_l)\, P_n(\cos\Theta), \quad r_l > r_s, \tag{11}$$

where $r_l$ is the magnitude of vector $\mathbf{r}_l$, and the time dependency has been omitted. If this sound field hits a
rigid spherical scatterer, the superposition of the impinging and the reflected sound fields may be given
according to Equation (12) as follows:

$$G(kr_l, kr_s, ka, \Theta) = -ikA \sum_{n=0}^{\infty} (2n+1)\, b_n(ka, kr_s)\, h_n^{(2)}(kr_l)\, P_n(\cos\Theta). \tag{12}$$

To show the connection to the farfield, assume $kr_l \gg 1$. The Hankel function can then be replaced by
Equation (13) as follows:

$$h_n^{(2)}(kr_l) \approx i^{n+1}\, \frac{e^{-ikr_l}}{kr_l} \quad \text{for } kr_l \gg 1. \tag{13}$$

Substituting Equation (13) in Equation (12) yields Equation (14) as follows:

$$G(kr_l, kr_s, ka, \Theta) = \frac{A}{r_l}\, e^{-ikr_l} \sum_{n=0}^{\infty} (2n+1)\, i^n\, b_n(ka, kr_s)\, P_n(\cos\Theta). \tag{14}$$

Except for an amplitude scaling and a phase shift, Equation (14) equals the farfield solution given in
Equation (6). The next section will give more details about the transition from nearfield to farfield, based
on the results presented above.

Modal Beamforming 

Modal beamforming is a powerful technique in beampattern design. Modal beamforming is based
on an orthogonal decomposition of the sound field, where each component is multiplied by a given
coefficient to yield the desired pattern. This procedure will now be described in more detail for a
continuous spherical pressure sensor on the surface of a rigid sphere.

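Assume that the continuous spherical microphone array has an aperture weighting function given
by $h(\vartheta, \varphi, \omega)$. Since this is a continuous function on a sphere, $h$ can be expanded into a series of spherical
harmonics according to Equation (15) as follows:

$$h(\vartheta, \varphi, \omega) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} C_{nm}(\omega)\, Y_n^m(\vartheta, \varphi). \tag{15}$$

The array factor $F$, which describes the directional response of the array, is given by Equation (16) as
follows:

$$F(\vartheta, \varphi, \omega) = \frac{1}{4\pi} \int_{\Omega_s} h(\vartheta_s, \varphi_s, \omega)\, G(\vartheta_s, \varphi_s, r_s, \vartheta, \varphi, \omega)\, d\Omega_s, \tag{16}$$

where $\Omega_s$ symbolizes the $4\pi$ space. To simplify the notation, the array factor is first computed for a single
mode $n'm'$, where $n'$ is the order and $m'$ is the degree. In the following analysis, a spherical scatterer with
plane-wave incidence is assumed. Changes to adapt this derivation for a soft scatterer and/or spherical
wave incidence are straightforward. For the plane-wave case, the array factor becomes Equation (17) as
follows:

$$F_{n'm'}(\vartheta, \varphi, \omega) = \frac{1}{4\pi} \int_{\Omega_s} Y_{n'}^{m'}(\vartheta_s, \varphi_s)\, G(\vartheta_s, \varphi_s, r_s, \vartheta, \varphi, \omega)\, d\Omega_s = i^{n'}\, b_{n'}(ka, kr_s)\, Y_{n'}^{m'}(\vartheta, \varphi). \tag{17}$$

This means that the farfield pattern for a single mode is identical to the sensitivity function of this mode,
except for a frequency-dependent scaling. The complete array factor can now be obtained by adding up all
modes according to Equation (18) as follows:

$$F(\vartheta, \varphi, \omega) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} C_{nm}(\omega)\, i^n\, b_n(ka, kr_s)\, Y_n^m(\vartheta, \varphi). \tag{18}$$

Comparing Equation (18) with Equation (15), if $C$ is normalized according to Equation (19) as follows:

$$\hat{C}_{nm}(\omega) = \frac{C_{nm}(\omega)}{i^n\, b_n(ka, kr_s)}, \tag{19}$$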

then the array factor equals the aperture weighting function. This results in the following steps to
implement a desired beampattern:

(1) Determine the desired beampattern h;

(2) Compute the series coefficients C;

(3) Normalize the coefficients according to Equation (19); and

(4) Apply the aperture weighting function of Equation (15) to the array using the normalized
coefficients from step (3).

Equation (18) is a spherical harmonic expansion of the array factor. Since the spherical harmonics
$Y$ are mutually orthogonal, a desired beampattern can be easily designed. For example, if $C_{00}$ and $C_{10}$ are
chosen to be unity and all other coefficients are set to zero, then the superposition of the omnidirectional
mode ($Y_0$) and the dipole mode ($Y_1^0$) will result in a cardioid pattern.

From Equation (19), the term $i^n b_n$ plays an important role in the beamforming process. This term
will be analyzed further in the following sections. Also, the corresponding terms for a velocity sensor, a
soft sphere, and spherical wave incidence will be given.

Acoustically Rigid Sphere 

For an array on a rigid sphere, the coefficients $b_n$ are given by Equation (5). These coefficients
give the strength of the mode dependent on the frequency. Fig. 3A shows the magnitude of the
coefficients $b_n$ for orders $n = 0$ to $n = 6$ for an array on the surface of the sphere ($r = a$), where a continuous
array of omnidirectional sensors is assumed. In Fig. 3A, for very low frequencies, only the zero mode is
present. For $ka = 0.2$ (for a sphere with a radius of $a = 5$ cm, this results in a frequency of about 220 Hz), the
first mode is down by 20 dB. At higher frequencies, more modes emerge. Once the mode has reached a
certain level, it can be used to form the directivity pattern. The required level depends on the amount of
noise and design robustness for the array. For example, in order to use the second-order mode at $ka = 0.3$, it
is preferably amplified by about 40 dB.
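
The levels quoted above can be reproduced approximately with a short calculation. The sketch below is illustrative and rests on assumptions: it takes the rigid-sphere mode coefficient to have the standard form $b_n(ka, kr) = j_n(kr) - \frac{j_n'(ka)}{h_n^{(2)\prime}(ka)} h_n^{(2)}(kr)$, evaluates it on the surface ($kr = ka$), uses a speed of sound of 343 m/s, and computes the spherical Bessel functions with simple stdlib routines that are only intended for small arguments and low orders. All names are hypothetical.

```python
import math


def spherical_jn(n, x, terms=60):
    """Spherical Bessel function of the first kind, from its power series."""
    term = 1.0
    for k in range(1, n + 1):
        term *= x / (2 * k + 1)
    total = term
    for k in range(terms):
        term *= -(x * x / 2.0) / ((k + 1) * (2 * n + 2 * k + 3))
        total += term
    return total


def spherical_yn(n, x):
    """Spherical Bessel function of the second kind, by upward recurrence."""
    y_prev = -math.cos(x) / x
    if n == 0:
        return y_prev
    y = -math.cos(x) / x ** 2 - math.sin(x) / x
    for k in range(1, n):
        y_prev, y = y, (2 * k + 1) / x * y - y_prev
    return y


def spherical_h2(n, x):
    """Spherical Hankel function of the second kind: j_n - i*y_n."""
    return spherical_jn(n, x) - 1j * spherical_yn(n, x)


def derivative(f, n, x):
    """Derivative via f_n' = f_{n-1} - (n+1)/x * f_n, with f_0' = -f_1."""
    if n == 0:
        return -f(1, x)
    return f(n - 1, x) - (n + 1) / x * f(n, x)


def rigid_mode_coefficient(n, ka, kr):
    """Assumed rigid-sphere mode coefficient b_n(ka, kr)."""
    ratio = derivative(spherical_jn, n, ka) / derivative(spherical_h2, n, ka)
    return spherical_jn(n, kr) - ratio * spherical_h2(n, kr)


def mode_level_db(n, ka):
    """Level of |b_n| in dB for sensors on the surface (kr = ka)."""
    return 20.0 * math.log10(abs(rigid_mode_coefficient(n, ka, ka)))


def frequency_hz(ka, a, c=343.0):
    """Frequency for a given ka; c = 343 m/s is an assumed speed of sound."""
    return ka * c / (2.0 * math.pi * a)


if __name__ == "__main__":
    print("f at ka=0.2, a=5 cm:", frequency_hz(0.2, 0.05), "Hz")
    print("mode 1 at ka=0.2:", mode_level_db(1, 0.2), "dB")
    print("mode 2 at ka=0.3:", mode_level_db(2, 0.3), "dB")
```

With these assumptions, the first-order mode at $ka = 0.2$ comes out near -20 dB and the second-order mode at $ka = 0.3$ near -40 dB, in line with the figures in the text.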

Instead of mounting the array of sensors on the surface of the sphere, in alternative embodiments,
one or more or even all of the sensors can be mounted at elevated positions over the surface of the sphere.
Fig. 3B shows the mode coefficients for an elevated array, where the distance between the array and the
spherical surface is 2a. In contrast to the array on the surface represented in Fig. 3A, the frequency
response shown in Fig. 3B has zeros. This limits the usable bandwidth of such an array. One advantage is
that the amplitude at low frequencies is significantly higher, which allows higher directivity at lower
frequencies.



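Acoustically Rigid Sphere with Velocity Microphones

Instead of using pressure sensors, velocity sensors could be used. From Equation (2), the radial
velocity is given by Equation (20) as follows:

$$v_r(kr, ka, \vartheta) = \frac{1}{i\omega\rho_0}\, \frac{\partial G(kr, ka, \vartheta)}{\partial r} = \frac{1}{i\rho_0 c} \sum_{n=0}^{\infty} (2n+1)\, i^n \left( j_n'(kr) - \frac{j_n'(ka)}{h_n^{(2)\prime}(ka)}\, h_n^{(2)\prime}(kr) \right) P_n(\cos\vartheta). \tag{20}$$

According to the boundary condition on the surface of an acoustically rigid sphere, the velocity for $r = a$ will
be zero, as indicated by Equation (20). The mode coefficients for the radial velocity sensors are given by
Equation (21) as follows:

$$b_n(ka, kr) = j_n'(kr) - \frac{j_n'(ka)}{h_n^{(2)\prime}(ka)}\, h_n^{(2)\prime}(kr). \tag{21}$$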

Figs. 4 and 5 show the mode magnitude for velocity sensors oriented radially at r=1.05a and r=1.1a,
respectively. These sensors behave very differently from the omnidirectional sensors. For low
frequencies, the first-order mode is dominant. This is the "native" mode of a velocity sensor. Mode zero
and mode two are also quite strong. This would enable a higher directivity at very low frequencies
compared to the pressure modes. A drawback of the velocity modes is that they have
singularities in the desired operating frequency range. This means that, before a mode is used
for a directivity pattern, it should be checked to see if it has a singularity at a desired frequency.
Fortunately, the singularities do not appear frequently but show up only once per mode in the typical
frequency range of interest. The singularities in the velocity modes correspond to the maxima in the
pressure modes. They also experience a 90° phase shift (compare Equations (20) and (6)).

The difference between Fig. 4 and Fig. 5 is the distance of the microphones to the surface of the
sphere. Comparing the two figures, one finds that the sensitivity is higher for a larger distance. This is true
as long as the distance is less than one quarter of a wavelength. At that distance from a rigid wall, the
velocity has a maximum. For a distance of half the wavelength, the velocity is zero, which means that the
distance of the array from the surface of the sphere should not be increased arbitrarily. For d=1.1a, a
distance of λ/2 away from the surface corresponds to ka=10π. This corresponds to the position of the zero
in Fig. 5.



For a fixed distance, the velocity increases with frequency. This is true as long as the distance is 
less than one quarter of the wavelength. Since, at the same time, the energy is spread over an increasing
number of modes, the mode magnitude does not roll off with a dB slope, as is the case for the pressure 
modes. 

Unfortunately, there are no true velocity microphones of very small sizes. Typically, a velocity 
microphone is implemented as an equalized first-order pressure differential microphone. Comparing this 
to Equation (20), the coefficients b_n are then scaled by k. Since usually the pressure differential is
approximated by only the pressure difference between two omnidirectional microphones, an additional
scaling of 20log(l) is taken into account, where l is the distance between the two microphones.



Acoustically Soft Sphere 

For a plane-wave impinging onto an acoustically soft sphere, the pressure mode coefficients 
become b_n^(s). The magnitude of these is plotted in Fig. 6 for a distance of 1.1a. They look like a mixture
of the pressure modes and the velocity modes for the rigid sphere. For low frequencies, only the zero-order
mode is present. With increasing frequency, more and more modes emerge. The rising slope is about 6n
dB, where n is the order of the mode. Similar to the velocity in front of a rigid surface, the pressure in 
front of a soft surface becomes zero at a distance of half of a wavelength away from the surface. Similar to 
the velocity modes in front of a rigid scatterer, the effect of decreasing mode magnitude with an increasing 
number of modes is compensated by the fact that the pressure increases for a fixed distance until the 
distance is a quarter wavelength. Therefore, the mode magnitude remains more or less constant up to this 
point. 

Acoustically Soft Sphere with Velocity Microphones 

For velocity microphones on the surface of a soft sphere, the mode coefficients are given by 
Equation (22) as follows:

(22)

The magnitude of these coefficients is plotted in Fig. 7. They behave similarly to the pressure modes for the
rigid sphere, except that all modes are "shifted" one to the left. They start with a slope of about 6(n-1) dB.
This is attractive especially for low frequencies. For example, at ka=0.2, mode zero and mode one are
only about 13 dB apart, while, for the pressure modes, there is a difference of about 20 dB. Also, between
mode one and mode two, the gap is reduced by about 4 dB. This configuration will allow high directivity
for a given signal-to-noise ratio.

One way to implement an array with velocity sensors on the surface of a soft sphere might be to
use vibration sensors that detect the normal velocity at the surface. However, the bigger problem will be to
build a soft sphere. The term "soft" ideally means that the specific impedance of the sphere is zero. In
practice, it will be sufficient if the impedance of the sphere is much less than the impedance of the medium
surrounding the sphere. Since the specific impedance of air is quite low (Z_s=ρ0c=414 kg/m²s), building a
soft sphere for airborne sound is essentially infeasible. However, a soft sphere can be implemented for
underwater applications. Since water has a specific impedance of 1.48*10^6 kg/m²s, an elastic shell filled
with air could be used as a soft sphere.
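
As a rough check of the impedance argument above, the following sketch compares the two specific impedances quoted in the text and evaluates a normal-incidence plane-wave pressure reflection coefficient for an air-filled boundary in water. The formula R = (Z2 - Z1)/(Z2 + Z1) is the standard plane-wave result and is brought in here as an assumption for illustration; it is not part of the original description, and the function name is hypothetical.

```python
# Specific acoustic impedances quoted in the text, in kg/(m^2 s).
Z_AIR = 414.0
Z_WATER = 1.48e6


def pressure_reflection(z_medium: float, z_boundary: float) -> float:
    """Normal-incidence plane-wave pressure reflection coefficient for a wave
    travelling in a medium of impedance z_medium and meeting a boundary of
    impedance z_boundary (illustrative helper, not from the source)."""
    return (z_boundary - z_medium) / (z_boundary + z_medium)


# An ideally soft boundary gives R = -1; an ideally rigid one gives R = +1.
impedance_ratio = Z_WATER / Z_AIR
r_air_boundary_in_water = pressure_reflection(Z_WATER, Z_AIR)

print(f"water/air impedance ratio: {impedance_ratio:.0f}")
print(f"reflection coefficient seen from water: {r_air_boundary_in_water:.5f}")
```

Under these assumptions the coefficient lies very close to -1, which is the sense in which an air-filled shell can approximate a soft sphere under water.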

Spherical Wave Incidence

This section describes the case of a spherical wave impinging onto a rigid spherical scatterer.
Since the pressure modes are the most practical ones, only they will be covered. The results will give an
understanding of the nearfield-to-farfield transition.

According to Equation (12), the mode coefficients for spherical sound incidence are given by
Equation (23) as follows: 

(23) 

where the superscript (p) indicates spherical wave incidence. The mode coefficients are a scaled version of
the farfield pressure modes. 

In Figs. 8A-D, the magnitude of the modes is plotted for various distances r_l of the sound source.
For short distances of the sound source, the higher modes are of higher magnitude at low ka. They also do
not show the 6n dB increase but are relatively constant. This behavior can be explained by the
low argument limit of the scaling factor given by Equation (24) as follows:

[...]    (24)

Thus, for low kr_l, the scaling factor has a slope of about -6n dB, which compensates the 6n dB slope of
b_n and results in a constant. The appearance of the higher-order modes at low kr's becomes clear by keeping
in mind that the modes correspond to a spherical harmonic decomposition of the sound pressure
distribution on the surface of the sphere. The shorter the distance of the source from the sphere, the more
unequal will be the sound pressure distribution even for low frequencies, and this will result in higher-
order terms in the spherical harmonics series. This also means that, for short source distances, a higher
directivity at low frequencies could be achieved since more modes can be used for the beampattern.
However, this beampattern will be valid only for the designed source distance. For all other distances, the
modes will experience a scaling that will result in the beampattern given by Equation (25) as follows:








[...] is unity (about 0 dB) for ka→0. This normalization removes the 1/r_l dependency for point sources.

For the high argument limit, it was already shown that the mode coefficients are equal to the plane- 
wave incidence. Comparing the spherical wave incidence for larger source distances (Fig. 8D, r_l=10a)
with plane-wave incidence (Fig. 3A), one finds only small differences for low ka. For example, at ka=0.2,
mode one is about 1 to 2 dB stronger for the spherical wave incidence. Since the array is preferably
designed to be robust against magnitude and phase errors, these small deviations are not expected to cause
significant degradation in the array performance. Therefore, a source distance of about ten times the radius 
of the sphere can be regarded as farfield. 

Sampling the Sphere 

So far, only a continuous array has been treated. On the other hand, an actual array is implemented 
using a finite number of sensors corresponding to a sampling of the continuous array. Intuitively, this
sampling should be as uniform as possible. Unfortunately, there exist only five possibilities to divide the
surface of a sphere into equivalent areas. These five geometries, which are known as regular polyhedrons or
Platonic Solids, consist of 4, 6, 8, 12, and 20 faces, respectively. Another geometry that comes close to a
regular division is the so-called truncated icosahedron, which is an icosahedron having its vertices cut off,
hence the term "truncated." This results in a solid consisting of 20 hexagons and 12 pentagons. A
microphone array based on a truncated icosahedron is referred to herein as a TIA (truncated icosahedron 
array). Fig. 9 identifies the positions of the centers of the faces of a truncated icosahedron in spherical 
coordinates, where the angles are specified in degrees. Fig. 2 illustrates the microphone locations for a 
TIA on the surface of a sphere. 

Other possible microphone arrangements include the center of the faces (20 microphones) of an 
icosahedron or the center of the edges of an icosahedron (30 microphones). In general, the more
microphones used, the higher will be the upper maximum frequency. On the other hand, the cost usually
increases with the number of microphones. 

Referring again to the TIA of Figs. 2 and 9, each microphone positioned at the center of a 
pentagon has five neighbors at a distance of 0.65a, where a is the radius of the sphere. Each microphone 
positioned at the center of a hexagon has six neighbors, of which three are at a distance of 0.65a and the
other three are at a distance of 0.73a. Applying the sampling theorem (d<λ/2, d being the distance of the
sensors, λ being the wavelength) and taking the worst case, the maximum frequency is given by Equation
(26) as follows:

$f_{\max} = \frac{c}{2 \cdot 0.73\,a}$    (26)

where c is the speed of sound. For a sphere with radius a=5 cm, this results in an upper frequency limit of
4.7 kHz. In practice, a slightly higher maximum frequency can be expected since most microphone
distances are less than 0.73a, namely 0.65a. The upper frequency limit can be increased by reducing the
radius of the sphere. On the other hand, reducing the radius of the sphere would reduce the achievable
directivity at low frequencies. Therefore, a radius of 5 cm is a good compromise.
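
The 4.7 kHz figure can be reproduced with a few lines of arithmetic. The sketch below evaluates the half-wavelength spacing rule for the two neighbor distances named in the text; the speed of sound of 343 m/s is an assumed value (the text does not state one), and the function name is hypothetical.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 degrees C


def spacing_limited_frequency(radius_m: float, spacing_factor: float,
                              c: float = SPEED_OF_SOUND) -> float:
    """Upper frequency from d < lambda/2 with sensor spacing d = spacing_factor * radius."""
    spacing_m = spacing_factor * radius_m
    return c / (2.0 * spacing_m)


# Worst case: the largest neighbor distance on the TIA, 0.73a, with a = 5 cm.
f_worst_case = spacing_limited_frequency(0.05, 0.73)
# Most neighbor pairs are only 0.65a apart, which gives a somewhat higher limit.
f_common_case = spacing_limited_frequency(0.05, 0.65)

print(f"worst case:  {f_worst_case / 1000:.2f} kHz")
print(f"common case: {f_common_case / 1000:.2f} kHz")
```

Halving the radius doubles both limits, which is the trade-off against low-frequency directivity described above.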

Equation (15) gives the aperture weighting function for the continuous array. Using discrete 
elements, this function will be sampled at the sensor location, resulting in the sensor weights given by
Equation (27) as follows:

$h_s(\omega) = \sum_{n}\sum_{m} C_{nm}(\omega)\, Y_n^m(\vartheta_s, \varphi_s)$    (27)

where the index s denotes the s-th sensor. The array factor given in Equation (16) now turns into a sum
according to Equation (28) as follows:

F(ϑ,φ,ω) = [...] Σ_s h_s(ϑ_s,φ_s,ω) G(ϑ_s,φ_s,r_s,ϑ,φ,ω)    (28)

With a discrete array, spatial aliasing should be taken into account. Similar to time aliasing, spatial
aliasing occurs when a spatial function, e.g., the spherical harmonics, is undersampled. For example, in
order to distinguish 16 harmonics, at least 16 sensors are needed. In addition, the positions of the sensors
are important. For this description, it is assumed that there are a sufficient number of sensors located in
suitable positions such that spatial aliasing effects can be neglected. In that case, Equation (28) will
become Equation (29) as follows:

(29)

which requires Equation (30) to be (at least substantially) satisfied as follows:

(30)






To account for deviations, a correction factor can be introduced. For best performance, this factor
should be close to one for all n, m of interest.



Robustness Measure (White Noise Gain)

The white noise gain (WNG), which is the inverse of noise sensitivity, is a robustness measure 
with respect to errors in the array setup. These errors include the sensor positions, the filter weights, and 
the sensor self-noise. The WNG as a function of frequency is defined according to Equation (31) as
follows:

$WNG(\omega) = \frac{\left|F(\vartheta_0, \varphi_0, \omega)\right|^2}{\sum_{s=0}^{M-1} \left|h_s(\omega)\right|^2}$    (31)

The numerator is the signal energy at the output of the array, while the denominator can be seen as the
output noise caused by the sensor self-noise. The sensor noise is assumed to be independent from sensor
to sensor. This measure also describes the sensitivity of the array to errors in the setup.
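
To make the ratio concrete, here is a minimal numerical sketch of a white noise gain computed as output signal energy in the look direction over the summed squared weight magnitudes. The convention that the array output is the plain sum of weight times sensor response, the random unit-magnitude responses, and all names are assumptions made for illustration only.

```python
import numpy as np


def white_noise_gain(weights, look_response) -> float:
    """|sum_s h_s * G_s|^2 / sum_s |h_s|^2 at a single frequency.

    weights       : complex sensor weights h_s
    look_response : complex sensor responses G_s to a unit source in the look direction
    """
    weights = np.asarray(weights, dtype=complex)
    look_response = np.asarray(look_response, dtype=complex)
    signal_energy = np.abs(np.sum(weights * look_response)) ** 2
    noise_energy = np.sum(np.abs(weights) ** 2)
    return float(signal_energy / noise_energy)


# Delay-and-sum on M sensors with unit-magnitude responses: every term adds in phase.
M = 32
rng = np.random.default_rng(0)
response = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, M))
wng_delay_and_sum = white_noise_gain(np.conj(response), response)

# A closely spaced differential pair subtracts nearly equal signals and loses WNG.
wng_differential = white_noise_gain([1.0, -1.0], [1.0, np.exp(-0.1j)])

print(10 * np.log10(wng_delay_and_sum), 10 * np.log10(wng_differential))
```

With these assumptions the delay-and-sum case returns M, i.e. about 15 dB for 32 sensors, while the differential pair falls far below 0 dB.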

The goal is now to find some general approximations for the WNG that give some indications
about the sensitivity of the array to noise, position errors, and magnitude and phase errors. To simplify the
notations, the look direction is assumed to be in the z-direction. The numerator can then be found from
Equation (28) according to Equation (32) as follows:

[...]    (32)

where N is the highest-order mode used for the beamforming. The number of all spherical harmonics up to
N-th order is (N+1)². The denominator is given by Equation (27) according to Equation (33) as follows:

[...]    (33)



Given Equations (32) and (33), a general prediction of the WNG is difficult. Two special cases will be
treated here: first, for a desired pattern that has only one mode and, second, for a superdirectional pattern
for which b_N << b_{N-1} (compare Fig. 3A).






If only mode N is present in the pattern, the WNG becomes Equation (34) as follows:

WNG(ω) ≈ [...]    (34)

For the omnidirectional (zero-order) mode, the numerator of Equation (34) equals M. Since b_0 is unity for
low frequency (compare Fig. 3A), WNG=M. This is the well-known result for a delay-and-sum
beamformer. It is also the highest achievable WNG. As the frequency increases, b_0 decreases and so does
the WNG. For other modes, the numerator is dependent on the sampling scheme of the array and has to be
determined individually.

Another coarse approximation can be given for the superdirectional case when b_N << b_{N-1}. In this
case, the sum over the (N+1)² modes in the nominator is dominated by the N-th mode and, using Equations
(32) and (33), the WNG results in Equation (35) as follows:

WNG(ω) ≈ [...]    (35)

Equation (35) can be further simplified if the term C_n √((2n+1)/(4π)) is constant for all modes. This would
result in a sinc-shaped pattern. In this case, the WNG becomes Equation (36) as follows:

WNG(ω) ≈ [...]    (36)

This result is similar to Equation (34), except that the WNG is increased by a factor of (N+1)². This is
reasonable, since every mode that is picked up by the array increases the output signal level.



Pattern Synthesis 

This section will give two suggestions on how to get the coefficients C_nm that are used to compute
the sensor weights h_s according to Equation (27). The first approach implements a desired beampattern
h(ϑ,φ,ω), while the second one maximizes the directivity index (DI). There are many more ways to design
a beampattern. Both methods described below will assume a look direction towards ϑ=0. After those two
methods, the subsequent section describes how to turn the pattern, e.g., to steer the main lobe to any
desired direction in 3-D space.



Implementing a Desired Beampattern

For a beampattern with look direction ϑ=0 and rotational symmetry in φ-direction, the coefficients
C_nm can be computed according to Equation (37) as follows:

$C_n(\omega) = 2\pi \int_0^{\pi} Y_n(\vartheta, \varphi)\, h(\vartheta, \omega)\, \sin\vartheta \, d\vartheta$    (37)

The question remains how to choose the pattern h itself. This depends very much on the application for
which the array will be used. As an example, Table 1 gives the coefficients C_n in order to get a
hypercardioid pattern of order n, where the pattern h is normalized to unity for the look direction. The
coefficients are given up to third order.



Order    C0        C1        C2        C3
1        0.8862    1.535     0         0
2        0.3939    0.6822    0.8807    0
3        0.2216    0.3837    0.4954    0.5862

Table 1: Coefficients for hypercardioid patterns of order n.
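
The normalization stated for Table 1 can be checked numerically. The sketch below evaluates the pattern as a sum of C_n times the zero-degree spherical harmonic, taking Y_n(ϑ) = sqrt((2n+1)/(4π)) P_n(cos ϑ) as an assumed definition (the document's own Equation (3) is not reproduced here). The closed form in `closed_form_coefficient` is only an observation that happens to match the tabulated values; it is not taken from the text, and all function names are hypothetical.

```python
import math

# Coefficients C_0..C_3 from Table 1, keyed by pattern order.
TABLE_1 = {
    1: [0.8862, 1.535, 0.0, 0.0],
    2: [0.3939, 0.6822, 0.8807, 0.0],
    3: [0.2216, 0.3837, 0.4954, 0.5862],
}


def legendre(n: int, x: float) -> float:
    """Legendre polynomial P_n(x) via the standard three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def y_n0(n: int, theta: float) -> float:
    """Zero-degree spherical harmonic (assumed normalization)."""
    return math.sqrt((2 * n + 1) / (4 * math.pi)) * legendre(n, math.cos(theta))


def pattern(order: int, theta: float) -> float:
    """Beampattern value h(theta) for the hypercardioid of the given order."""
    return sum(c * y_n0(n, theta) for n, c in enumerate(TABLE_1[order]))


def closed_form_coefficient(order: int, n: int) -> float:
    """Observed fit to Table 1: sqrt(4*pi*(2n+1)) / (order+1)^2 for n <= order."""
    if n > order:
        return 0.0
    return math.sqrt(4 * math.pi * (2 * n + 1)) / (order + 1) ** 2


for order in TABLE_1:
    print(order, round(pattern(order, 0.0), 4), round(pattern(order, math.pi), 4))
```

The first printed column after the order is the look-direction response, which should come out at unity to within the rounding of the table.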
Fig. 10 shows the 3-D pattern of a third-order hypercardioid at 4 kHz, where the microphones are
positioned on the surface of a sphere of radius 5 cm at the center of the faces of a truncated icosahedron.
Ideally, the pattern should be frequency independent, but, due to the sampling of the spherical surface,
aliasing effects show up at higher frequencies. In Fig. 10, a small effect caused by the spatial sampling can
be seen in the second side lobe. The pattern is not perfectly rotationally symmetric. This effect becomes
worse with increasing frequency. On a sphere of radius 5 cm, this sampling scheme will yield good results
up to about 5 kHz.

If the pattern from Fig. 10 is implemented with frequency-independent coefficients C_n, problems
may occur with the WNG at low frequencies. This can be seen in Fig. 11. In particular, higher-order
patterns may be difficult to implement at lower frequencies. On the other hand, implementing a pattern of
only first order for all frequencies means wasting directivity at higher frequencies.

Instead of choosing a constant pattern, it may make more sense to design for a constant WNG.
The quality of the sensors used and the accuracy with which the array is built determine the allowable
minimum WNG that can be accepted. A reasonable value is a WNG of -10 dB. Using hypercardioid
patterns results in the following frequency bands: 50 Hz to 400 Hz first-order, 400 Hz to 900 Hz
second-order, and 900 Hz to 5 kHz third-order. The upper limit is determined by the TIA and the radius of the
sphere of 5 cm. Fig. 12 shows the basic shape of the resulting filters C_n(ω), where the transitions are
preferably smoothed out, which will also give a more constant WNG.
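
The band plan above can be written as a simple lookup. This sketch uses hard band edges purely for illustration, whereas the text prefers smoothed transitions; the treatment of the exact edge frequencies and the function name are assumptions.

```python
# (lower edge in Hz, upper edge in Hz, hypercardioid order) for a WNG floor of about -10 dB.
BANDS = (
    (50.0, 400.0, 1),
    (400.0, 900.0, 2),
    (900.0, 5000.0, 3),
)


def hypercardioid_order(frequency_hz: float) -> int:
    """Return the pattern order used at the given frequency (hard switching)."""
    for low, high, order in BANDS:
        if low <= frequency_hz < high:
            return order
    if frequency_hz == BANDS[-1][1]:
        return BANDS[-1][2]  # include the 5 kHz upper limit itself
    raise ValueError(f"{frequency_hz} Hz is outside the 50 Hz to 5 kHz design range")


for f in (100.0, 400.0, 1000.0, 5000.0):
    print(f, hypercardioid_order(f))
```

In a real filter design the order would not jump at 400 Hz and 900 Hz; the coefficients of the higher order would be faded in across each edge.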

Maximizing the Directivity Index

[...] index (DI). A constraint for the white noise gain (WNG) is included in the optimization.
The directivity index is defined as the ratio of the energy picked up by a directive microphone
to the energy picked up by an omnidirectional microphone in an isotropic noise field, where both
microphones have the same sensitivity towards the look direction. If the directive microphone is operated
in a spherically isotropic field, the DI can be seen as the acoustical signal-to-noise improvement
achieved by the directive microphone.

For an array, the DI can be written in matrix notation according to Equation (38) as follows:

$DI = \frac{h^H P h}{h^H R h}$    (38)

where the frequency dependence is omitted for better readability. The vector [...]
at frequency ω0 according to Equation (39) as follows:

[...]    (39)

[...] look direction at ω0. For a pressure sensor close to a rigid sphere, these values can be [...]

R is the spatial cross-correlation matrix. The matrix elements are defined by Equation (40) as follows:

[...]    (40)

In matrix notation, the WNG is given by Equation (41) as follows:

$WNG = \frac{h^H P h}{h^H h}$    (41)

The last required piece is to express the sensor weights using [...]
Equation (27), which can again be written in matrix notation according to Equation (42) as follows:

h=Ac.    (42)






The vector c contains the spherical harmonic coefficients C_nm for the beampattern design. This is the
vector that has to be determined. According to Equations (27) and (19), the coefficients of A for the rigid
sphere case with plane-wave incidence are given by Equation (43) as follows:

(43) 

The notation assumes that only the spherical harmonics of degree 0 are used for the pattern. If necessary,
any other spherical harmonic can be included. The goal is now to maximize the DI with a constraint on the
WNG. This is the same as minimizing the function 1/f, where the Lagrange multiplier ε is used to include
the constraint, according to Equation (44) as follows:

$\frac{1}{f} = \frac{1}{DI} + \varepsilon\,\frac{1}{WNG}$    (44)

One ends up with the following Equation (45), which has to be maximized with respect to the coefficient
vector c:

$f(\varepsilon) = \frac{c^H A^H P A\, c}{c^H A^H (R + \varepsilon I) A\, c}$    (45)

where I is the identity matrix. Equation (45) is a generalized eigenvalue problem. Since A, R, and I are full
rank, the solution is the eigenvector corresponding to Equation (46) as follows:

$\max\left\{ \lambda\left( \left(A^H (R + \varepsilon I) A\right)^{-1} \left(A^H P A\right) \right) \right\}$,    (46)

where λ(·) means "eigenvalue from." Unfortunately, Equation (45) cannot be solved for ε. Therefore, one
way to find the maximum DI for a desired WNG is as follows:

Step (1): Find the solution to Equation (46) for an arbitrary ε.

Step (2): From the resulting vector c, compute the WNG.

Step (3): If the WNG is larger than desired, then return to Step (1) using a smaller ε. If the WNG is
too small, then return to Step (1) using a larger ε. If the WNG matches the desired WNG, then the process
is complete.
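
Steps (1) to (3) can be sketched as a bisection over ε. The code below is only an illustration under stated assumptions: it takes WNG = h^H P h / h^H h with h = Ac, searches ε on a logarithmic scale between 1e-8 and 1e4, and is exercised on a toy four-element free-field line array with A set to the identity matrix and a sinc-shaped noise correlation matrix, not on the spherical array of the text. All function names are hypothetical.

```python
import numpy as np


def solve_for_epsilon(A, P, R, eps):
    """Step (1): eigenvector of (A^H (R + eps I) A)^-1 (A^H P A) with the largest eigenvalue."""
    AH = A.conj().T
    lhs = AH @ (R + eps * np.eye(R.shape[0])) @ A
    rhs = AH @ P @ A
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(lhs, rhs))
    return eigvecs[:, np.argmax(eigvals.real)]


def wng_of(A, P, c):
    """Step (2): white noise gain of the weights h = A c (assumed definition)."""
    h = A @ c
    return float(np.real(h.conj() @ P @ h) / np.real(h.conj() @ h))


def max_di_for_wng(A, P, R, target_wng, log_lo=-8.0, log_hi=4.0, iterations=60):
    """Step (3): shrink eps when the WNG is too large, grow it when too small."""
    for _ in range(iterations):
        log_eps = 0.5 * (log_lo + log_hi)
        c = solve_for_epsilon(A, P, R, 10.0 ** log_eps)
        wng = wng_of(A, P, c)
        if wng > target_wng:
            log_hi = log_eps   # WNG larger than desired: use a smaller eps
        else:
            log_lo = log_eps   # WNG too small: use a larger eps
    return c, wng, 10.0 ** log_eps


# Toy problem (assumption): 4 sensors, 2 cm apart, endfire look direction, 1 kHz.
M = 4
k = 2.0 * np.pi * 1000.0 / 343.0
positions = 0.02 * np.arange(M)
g = np.exp(-1j * k * positions)                 # look-direction responses
P = np.outer(g, g.conj())
distances = np.abs(positions[:, None] - positions[None, :])
R = np.sinc(k * distances / np.pi)              # sin(kd)/(kd), isotropic noise field
A = np.eye(M, dtype=complex)

c, wng, eps = max_di_for_wng(A, P, R, target_wng=1.0)
print(f"eps = {eps:.3e}, WNG = {10 * np.log10(wng):.2f} dB")
```

Bisection relies on the WNG growing monotonically with ε, so any other one-dimensional root finder could be substituted.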

Notice that the choice of ε=0 results in the maximum achievable DI. On the other hand, ε→∞
results in a delay-and-sum beamformer. The latter one has the maximum achievable WNG, since all
sensor signals will be summed up in phase, yielding the maximum output signal. f(ε) depends
monotonically on ε.






Fig. 13 shows the maximum DI that can be achieved with the TIA using spherical harmonics up to
order N without a constraint on the WNG. Fig. 14 shows the WNG corresponding to the maximum DI in
Fig. 13. As long as the pattern is superdirectional, the WNG increases at about 6N dB per octave. The
maximum WNG that can be achieved is about 10 log M, which for the TIA is about 15 dB. This is the
value for an array in free field. In Fig. 14, for the sphere-baffled array, the maximum WNG is a bit higher,
about 17 dB. Once the maximum is reached, it decreases. This is due to the fact that the mode number in the
array pattern is constant. Since the mode magnitude decreases once a mode has reached its maximum, the
WNG is expected to decrease as soon as the highest mode has reached its maximum. For example, the
third-order mode shows this for f≈3 kHz (compare Fig. 3A).

Fig. 15 shows the maximum DI that can be achieved with a constraint on the WNG for a pattern
that contains the spherical harmonics up to third order. Here, one can see the tradeoff between WNG and
DI. The higher the required WNG, the lower the maximum DI, and vice versa. For a minimum WNG of
-5 dB, one gets a constant DI of about 12 dB in a frequency band from about 1 kHz to about 5 kHz.
Between 100 Hz and 1 kHz, the DI increases from about 6 dB to about 12 dB. 

Figs. 16A-B give the magnitude and phase, respectively, of the coefficients computed according to 
the procedure described above in this section, where N was set to 3, and the minimum required WNG was 
about -5 dB. Coefficients are normalized so that the array factor for the look direction is unity.
Comparing the coefficients from Figs. 16A-B with the coefficients from Fig. 12, one finds that they are
basically the same. Only the band transitions are more precise in Figs. 16A-B in order to keep the WNG
constant.



Rotating the Directivity Pattern

After the pattern is generated for the look direction ϑ=0, it is relatively straightforward to turn it to
a desired direction. Using Equation (27), the weights for a φ-symmetric pattern are given by Equation (47)
as follows:

[...]    (47)

Substituting Equation (3) in Equation (47), one ends up with Equation (48) as follows:

[...]    (48)

Comparing Equation (48) with Equation (27), one obtains for the new coefficients Equation (49) as follows:

$C'_{nm}(\omega) = C_n(\omega) \sqrt{\frac{(n-m)!}{(n+m)!}}\, P_n^m(\cos\vartheta_0)\, e^{-i m \varphi_0}$    (49)

Equation (49) enables control of the ϑ and φ directions independently. Also the pattern itself can be
implemented independently from the desired look direction.

Implementation of the Beamformer 

This section provides a layout for the beamformer based on the theory described in the previous 
sections. Of course, the spherical array can be implemented using a filter-and-sum beamformer as 
indicated in Equation (28). The filter-and-sum approach has the advantage of utilizing a standard 
technique. Since the spherical array has a high degree of symmetry, rotation can be performed by shifting 
the filters. For example, the TIA can be divided into 60 very similar triangles. Only one set of filters is 
computed with a look direction normal to the center of one triangle. Assigning the filters to different
sensors allows steering the array to 60 different directions. 

Alternatively, a scheme based on the structure of the modal beamformer of Fig. 1 may be 
implemented. This yields significant advantages for the implementation. Combining Equations (27), (28),
and (49), an expression for the array output is given by Equation (50) as follows: 

(50) 

Referring again to Fig. 1, audio system 100 is a second-order system. It is straightforward to
extend this to any order. Fig. 17 provides a generalized representation of audio systems of the present
invention. Decomposer 1704, corresponding to decomposer 104 of Fig. 1, performs the orthogonal modal
decomposition of the sound field measured by sensors 1702. In Fig. 17, the beamformer is represented by
steering unit 1706 followed by pattern generation 1708 followed by frequency response correction 1710
followed by summation node 1712. Note that, in general, not all of the available eigenbeam outputs have
to be used when generating an auditory scene. 

In audio system 100 of Fig. 1, decomposer 104 receives audio signals from S different sensors 102
(preferably configured on an acoustically rigid sphere) and generates nine different eigenbeam outputs
corresponding to the zero-order (n=0), first-order (n=1), and second-order (n=2) spherical harmonics. As
represented in Fig. 1, beamformer 106 comprises steering unit 108, compensation unit 110, and summation
unit 112. In this particular implementation, the frequency-response correction of compensation unit 110 is
applied prior to pattern generation, which is implemented by summation unit 112. This differs from the
representation in Fig. 17 in which correction unit 1710 performs frequency-response correction after
[...] advantageous to have the correction before the steering unit. In general, any order of steering,
pattern generation, and correction is possible.

Modal Decomposer

Decomposer 104 of Fig. 1 is responsible for decomposing [...]
the sound field is transformed from the time or frequency domain into the modal domain. The
mathematical analysis of the decomposition [...]

To simplify a time domain implementation, one can also work with the real and imaginary parts of the
spherical harmonics. This will result in real-valued coefficients which are suitable for a
[...] For a continuous spherical sensor with [...]
Equation (51) as follows:

$M = \mathrm{Re}\{Y_n^m(\vartheta, \varphi)\} = \begin{cases} \frac{1}{2}\left(Y_n^m(\vartheta, \varphi) + Y_n^{-m}(\vartheta, \varphi)\right) & \text{for } m \text{ even} \\ \frac{1}{2}\left(Y_n^m(\vartheta, \varphi) - Y_n^{-m}(\vartheta, \varphi)\right) & \text{for } m \text{ odd} \end{cases}$    (51)

[...] the array output F given by Equation (52) as follows:

[...]    (52)

If the sensitivity equals the imaginary part of a spherical harmonic, then the beampattern of the
corresponding array factor will also be the imaginary part of this spherical harmonic. [...]
harmonic is frequency weighted. To compensate for this frequency [...]
Fig. 1 may be implemented as described below in conjunction with Fig. 20.

For a practical implementation, the continuous spherical sensor is replaced by a discrete spherical
array. In this case, the integrals in the equations become sums. As before, the sensor [...]
satisfy (as closely as practicable) the orthonormality property given by Equation (53) as follows:

(53)

where S is the number of sensors, and [...] describes their positions. If the right side of Equation (53) [...]






Fig. 18 represents the structure of an eigenbeam former, such as generic decomposer 1704 of Fig. 
17 and second-order decomposer 104 of Fig. 1. Decomposers can be conveniently described using matrix 
notation according to Equation (54) as follows: 

(54) 

where [...] describes the output of the decomposer, s is a vector containing the sensor signals, and Y is a
(N+1)² x S matrix, where N is the highest order in the spherical harmonic expansion. The columns of Y
give the real and imaginary parts of the spherical harmonics for the corresponding sensor position. Table 2
shows the convention that is used for numbering the rows of matrix Y up to fifth-order spherical
harmonics, where n corresponds to the order of the spherical harmonic, m corresponds to the degree of the
spherical harmonic, and the label nm identifies the row number. For a fifth-order expansion, matrix Y has
(N+1)² or 36 rows, labeled in Table 2 from nm=0 to nm=35. For example, as indicated in Table 2, Row
nm=21 in matrix Y corresponds to the real part (Re) of the spherical harmonic of order (n=4) and degree
(m=3), while Row nm=22 corresponds to the imaginary part (Im) of that same spherical harmonic. Note
that the zero-degree (m=0) spherical harmonics have only real parts.



n     0      1      1      1      2      2      2      2      2
m     0      0      1(Re)  1(Im)  0      1(Re)  1(Im)  2(Re)  2(Im)
nm    0      1      2      3      4      5      6      7      8

n     3      3      3      3      3      3      3      4      4
m     0      1(Re)  1(Im)  2(Re)  2(Im)  3(Re)  3(Im)  0      1(Re)
nm    9      10     11     12     13     14     15     16     17

n     4      4      4      4      4      4      4      5      5
m     1(Im)  2(Re)  2(Im)  3(Re)  3(Im)  4(Re)  4(Im)  0      1(Re)
nm    18     19     20     21     22     23     24     25     26

n     5      5      5      5      5      5      5      5      5
m     1(Im)  2(Re)  2(Im)  3(Re)  3(Im)  4(Re)  4(Im)  5(Re)  5(Im)
nm    27     28     29     30     31     32     33     34     35

Table 2: Numbering scheme used for the rows of matrix Y
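
The numbering in Table 2 follows a simple rule: order n starts at row n², its zero-degree harmonic comes first, and each degree m ≥ 1 then contributes a real row followed by an imaginary row. A small sketch of that mapping follows; the function name and the "Re"/"Im" string arguments are illustrative choices, not notation from the text.

```python
def row_index(n: int, m: int, part: str = "Re") -> int:
    """Row number nm of matrix Y for order n, degree m, and part 'Re' or 'Im'."""
    if not 0 <= m <= n:
        raise ValueError("degree m must satisfy 0 <= m <= n")
    if part not in ("Re", "Im"):
        raise ValueError("part must be 'Re' or 'Im'")
    base = n * n  # rows 0 .. n*n - 1 are used by the lower orders
    if m == 0:
        if part == "Im":
            raise ValueError("zero-degree harmonics have only a real part")
        return base
    return base + 2 * m - (1 if part == "Re" else 0)


def all_rows(max_order: int) -> list:
    """Row numbers in table order for every harmonic up to max_order."""
    rows = []
    for n in range(max_order + 1):
        rows.append(row_index(n, 0))
        for m in range(1, n + 1):
            rows.append(row_index(n, m, "Re"))
            rows.append(row_index(n, m, "Im"))
    return rows


print(row_index(4, 3, "Re"), row_index(4, 3, "Im"), len(all_rows(5)))
```

For a fifth-order expansion the enumeration covers rows 0 through 35 without gaps, in agreement with the (N+1)² row count.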



Steering Unit 

Fig. 19 represents the structure of steering units, such as generic steering unit 1706 of Fig. 17 and 
second-order steering unit 108 of Fig. 1. Steering units are responsible for steering the look direction by
ϑ0 and φ0. The mathematical description of the output of a steering unit for the n-th order is given by
Equation (55) as follows:

Y_n(ϑ-ϑ0, φ-φ0) = [...] P_n^m(cos(ϑ0)) (cos(mφ0) Re{Y_n^m(ϑ,φ)} + sin(mφ0) Im{Y_n^m(ϑ,φ)})    (55)






Compensation Unit

As described previously, the output of the decomposer is frequency dependent. Frequency-response
correction, as performed by generic correction unit 1710 of Fig. 17 and second-order
compensation unit 110 of Fig. 1, adjusts for this frequency dependence to get a frequency-independent
representation of the spherical harmonics that can be used, e.g., by generic summation node 1712 of Fig.
17 and second-order summation unit 112 of Fig. 1, in generating the beampattern.

Fig. 20A shows the frequency-weighting function of the decomposer output, while Fig. 20B shows
the corresponding frequency-response correction that should be applied, where the frequency-response
correction is simply the inverse of the frequency-weighting function. In this case, the transfer function for
frequency-response correction may be implemented as a band-stop filter comprising an n-th order high-pass
filter configured in parallel with an n-th order low-pass filter, where n is the order of the corresponding
spherical harmonic output. At low ka, the gain has to be limited to a reasonable factor. Also note that Fig.
20 only shows the magnitude; the corresponding phase can be found from Equation (19).

Summation Unit 

Summation unit 112 of Fig. 1 performs the actual beamforming for system 100. Summation unit
112 weights each harmonic by a frequency response and then sums up the weighted harmonics to yield the
beamformer output (i.e., the auditory scene). This is equivalent to the processing represented by pattern
generation unit 1708 and summation node 1712 of Fig. 17.

Choosing the Array Parameters

The three major design parameters for a spherical microphone array are:

o The number of audio sensors (S);

o The radius of the sphere (a); and

o The location of the sensors.

The parameters S and a determine the array properties, of which the most important ones are:

o The white noise gain (WNG), which indirectly specifies the lower end of the operating frequency range;

o The upper frequency limit, which is determined by spatial aliasing; and

o The maximum order of harmonics that can be realized with the array (this is also dependent on the
WNG). This will also determine the maximum directivity that can be achieved with the array.

From a performance point of view, the best choices are big spheres with large numbers of sensors.
However, the number of sensors may be restricted in a real-time implementation by the ability of the
hardware to perform the required processing on all of the signals from the various sensors in real time.
Moreover, the number of sensors may be effectively limited by the capacity of available hardware. For
example, the availability of 32-channel processors (24-channel processors for mobile applications) may 
impose a practical limit on the number of sensors in the microphone array. The following sections will 
give some guidance to the design of a practical system. 

Upper Frequency Limit 

In order to find the upper frequency limit, depending on a and S, the approximation of Equation (56), which is based on the sampling theorem, can be used as follows:

(56)

The square-root term gives the approximate sensor distance, assuming the sensors are equally distributed and positioned in the center of a circular area. The speed of sound is c. Fig. 21 shows a graphical representation of Equation (56), representing the maximum frequency for no spatial aliasing as a function of the radius. This figure gives an idea of which radius to choose in order to get a desired upper frequency limit for a given number of sensors. Note that this is only an approximation.
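
The body of Equation (56) is not reproduced here, so the following is only one possible reading of the description, not the specification's formula. The sketch assumes that each sensor sits at the center of a circular area equal to 1/S of the sphere surface, that the sensor distance is the diameter of that circle, and that the sampling theorem requires this distance to be at most half a wavelength. The function names and the value used for c are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def approx_sensor_spacing(radius_m: float, num_sensors: int) -> float:
    """Approximate sensor distance under the equal-circular-area assumption."""
    area_per_sensor = 4.0 * math.pi * radius_m ** 2 / num_sensors
    circle_radius = math.sqrt(area_per_sensor / math.pi)
    return 2.0 * circle_radius


def approx_upper_frequency(radius_m: float, num_sensors: int,
                           c: float = SPEED_OF_SOUND) -> float:
    """Highest frequency whose half wavelength still spans the sensor distance."""
    return c / (2.0 * approx_sensor_spacing(radius_m, num_sensors))
```

Under these assumptions the limit grows with the square root of S and falls inversely with a, which matches the qualitative behavior described in the text.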

Maximum Directivity Index 

The minimum number of sensors required to pick up all harmonic components is (N+1)^2, where N is the order of the pattern. This means that, for a second-order array, at least nine elements are needed and, for a third-order array, at least 16 sensors are needed to pick up all harmonic components. These numbers assume the ability to generate an arbitrary beampattern of the given order. If the beampatterns can be restricted somehow, e.g., the look direction is fixed or needs to be steered only in one plane, then the number of sensors can be reduced since, in those situations, all of the harmonic components (i.e., the full set of eigenbeams) are not needed.
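
The (N+1)^2 counting rule can be turned into a small helper. This is a minimal sketch with hypothetical function names. It captures only the sensor count, which is a necessary condition; the sensor locations must still support the decomposition.

```python
import math


def min_sensors_for_order(order: int) -> int:
    """Number of harmonic components, and hence minimum sensors, up to `order`."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def max_order_for_sensors(num_sensors: int) -> int:
    """Largest order N with (N + 1)^2 <= num_sensors."""
    if num_sensors < 1:
        raise ValueError("need at least one sensor")
    return math.isqrt(num_sensors) - 1
```

For example, a 24-sensor array has enough sensors for a full third-order decomposition but not for fourth order, which would need 25.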

Robustness Measure 

A general expression of the white noise gain (WNG) as a function of the number of microphones and radius of the sphere cannot be given, since it depends on the sensor locations and, to a great extent, on the beampattern. If the beampattern consists of only a single spherical harmonic, then an approximation of the WNG is given by Equation (57) as follows:

    WNG(a, S, f) \approx S^2 |b_n(a, f)|^2        (57)



The factor b_n represents the mode strength (see Fig. 20A). Table 3 shows the gain that is achieved due to the number of sensors. It can be seen that the gain in general is quite significant, but increases by only 6 dB when the number of sensors is doubled.

    S                12    16    20    24    32
    20log(S) [dB]    22    24    26    28    30
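
The sensor-count contribution to the WNG follows directly from the S^2 factor in Equation (57). A minimal sketch (the function name is an assumption) that reproduces the rounded values listed above:

```python
import math


def sensor_count_gain_db(num_sensors: int) -> float:
    """Gain in dB contributed by the S^2 factor, i.e. 20*log10(S)."""
    return 20.0 * math.log10(num_sensors)


# Rounded gains for the sensor counts considered in the text.
gains = {s: round(sensor_count_gain_db(s)) for s in (12, 16, 20, 24, 32)}
```

Doubling S adds 20*log10(2), or about 6 dB, regardless of the starting count.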



P«o„,ar, *e fig»» Show fte ™^ ^ 3, , „ . 

Preferred Array Parameters 

assumed to be 24 For an iir,n«- fiv. o "^™nnumber of sensors is 

audio .J . ' ^ ^ 32 fco^^^g 

audio signals. Table 6 identifies the sensor locations for one possible six-element spherical array, and Table 7 identifies the sensor locations for one possible four-element spherical array.



    Sensor #    φ_s [°]    ϑ_s [°]    a [mm]
        1         108       37.38      37.5
        2         180       37.38      37.5
        3         252       37.38      37.5
        4         -36       37.38      37.5
        5          36       37.38      37.5
        6         -72      142.62      37.5
        7           0      142.62      37.5
        8          72      142.62      37.5
        9         144      142.62      37.5
       10         216      142.62      37.5
       11         108       79.2       37.5
       12         180       79.2       37.5
       13         252       79.2       37.5
       14         -36       79.2       37.5
       15          36       79.2       37.5
       16         -72      100.8       37.5
       17           0      100.8       37.5
       18          72      100.8       37.5
       19         144      100.8       37.5
       20         216      100.8       37.5

Table 4: Locations for a 20-element icosahedron spherical array



    Sensor #    φ_s [°]    ϑ_s [°]    a [mm]
        1           0       37.38      37.5
        2          60       37.38      37.5
        3         120       37.38      37.5
        4         180       37.38      37.5
        5         240       37.38      37.5
        6         300       37.38      37.5
        7           0       79.2       37.5
        8          60       79.2       37.5
        9         120       79.2       37.5
       10         180       79.2       37.5
       11         240       79.2       37.5
       12         300       79.2       37.5
       13          30      100.8       37.5
       14          90      100.8       37.5
       15         150      100.8       37.5
       16         210      100.8       37.5
       17         270      100.8       37.5
       18         330      100.8       37.5
       19          30      142.62      37.5
       20          90      142.62      37.5
       21         150      142.62      37.5
       22         210      142.62      37.5
       23         270      142.62      37.5
       24         330      142.62      37.5

Table 5: Locations for a 24-element "extended icosahedron" spherical array
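
For layout or simulation work, the Table 5 locations can be converted to Cartesian coordinates. The sketch below assumes the usual convention in which the first angle is the azimuth and the second is the polar angle measured from the z-axis; that convention and all names are assumptions of the sketch.

```python
import math

RADIUS_MM = 37.5
# (polar angle in degrees, azimuth of the first sensor in the ring in degrees)
RINGS = [(37.38, 0.0), (79.2, 0.0), (100.8, 30.0), (142.62, 30.0)]


def to_cartesian(azimuth_deg: float, polar_deg: float, radius: float):
    """Spherical to Cartesian, polar angle measured from the z-axis."""
    phi = math.radians(azimuth_deg)
    theta = math.radians(polar_deg)
    return (radius * math.sin(theta) * math.cos(phi),
            radius * math.sin(theta) * math.sin(phi),
            radius * math.cos(theta))


def extended_icosahedron_layout():
    """Four rings of six sensors each, spaced 60 degrees apart in azimuth."""
    return [to_cartesian(first + 60.0 * k, polar, RADIUS_MM)
            for polar, first in RINGS
            for k in range(6)]
```

Because the rings are mirror images about the equator and each ring is evenly spaced, the centroid of the 24 positions falls at the center of the sphere.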



    Sensor #    φ_s [°]    ϑ_s [°]    a [mm]
        1           0        90         10
        2          90        90         10
        3         180        90         10
        4         270        90         10
        5           0         0         10
        6           0       180         10

Table 6: Locations for a six-element icosahedron spherical array



    Sensor #    φ_s [°]    ϑ_s [°]    a [mm]
        1           0         0         10
        2           0       109.5       10
        3         120       109.5       10
        4         240       109.5       10

Table 7: Locations for a four-element spherical array



One problem that exists to at least some extent with each of these configurations relates to spatial aliasing. At higher frequencies, a continuous soundfield cannot be uniquely represented by a finite number of sensors. This causes a violation of the discrete orthonormality property that was discussed previously. As a result, the eigenbeam representation becomes problematic. This problem can be overcome by using sensors that integrate the acoustic pressure over a predefined aperture. This integration can be characterized as a "spatial low-pass filter."
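
The discrete orthonormality property can be checked numerically for a given layout. The sketch below uses the six-element layout of Table 6 with equal quadrature weights 4π/S and real-valued spherical harmonics; the equal weighting, the real-valued harmonics, and all names are assumptions of the sketch rather than the specification's formulation.

```python
import math

# Table 6 layout as (azimuth, polar angle) in degrees; the radius is not needed.
SIX_ELEMENT = [(0, 90), (90, 90), (180, 90), (270, 90), (0, 0), (0, 180)]


def unit_vector(azimuth_deg: float, polar_deg: float):
    phi, theta = math.radians(azimuth_deg), math.radians(polar_deg)
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def real_harmonics(x: float, y: float, z: float) -> dict:
    """Real spherical harmonics of orders 0 and 1 plus the zonal order-2 term."""
    return {
        "Y00": math.sqrt(1.0 / (4.0 * math.pi)),
        "Y1x": math.sqrt(3.0 / (4.0 * math.pi)) * x,
        "Y1y": math.sqrt(3.0 / (4.0 * math.pi)) * y,
        "Y1z": math.sqrt(3.0 / (4.0 * math.pi)) * z,
        "Y20": math.sqrt(5.0 / (16.0 * math.pi)) * (3.0 * z * z - 1.0),
    }


def discrete_inner_product(name_a: str, name_b: str, layout) -> float:
    """Equal-weight discrete approximation of the surface integral of Ya * Yb."""
    total = 0.0
    for azimuth, polar in layout:
        h = real_harmonics(*unit_vector(azimuth, polar))
        total += h[name_a] * h[name_b]
    return 4.0 * math.pi / len(layout) * total
```

With this layout, the order-0 and order-1 harmonics come out orthonormal, while the order-2 term no longer has unit norm. This is consistent with six sensors being too few for a second-order decomposition.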



Spherical Array with Integrating Sensors

Spatial aliasing is a serious problem that causes a limitation of usable bandwidth. To address this problem, a modal low-pass filter may be employed as an anti-aliasing filter. Since this would suppress higher-order modes, the frequency range can be extended. The new upper frequency limit would then be caused by other factors, such as the computational capability of the hardware, the A/D conversion, or the "roundness" of the sphere.

One way to implement a modal low-pass filter is to use microphones with large membranes. These microphones act as a spatial low-pass filter. For example, in free field, the directional response of a microphone with a circular piston in an infinite baffle is given by Equation (58) as follows:

    \frac{2 J_1(ka \sin\vartheta)}{ka \sin\vartheta}        (58)

where J_1 is the Bessel function of order one, a is the radius of the piston, and ϑ is the angle off-axis. This is referred to as a spatial low-pass filter since, for small arguments (ka sin ϑ << 1), the sensitivity is high, while for large arguments, the sensitivity goes to zero. This means that only sound from a limited region is recorded. Generally, this behavior is true for pressure sensors with a significant (relative to the acoustic wavelength) membrane size. The following provides a derivation for an expression for a conformal patch microphone on the surface of a rigid sphere.
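
The spatial low-pass behavior of a baffled circular piston can be evaluated with the standard expression 2 J_1(x)/x, where x = ka sin ϑ. The sketch below is illustrative only: to stay self-contained it computes J_1 from Bessel's integral with a trapezoidal rule rather than calling a numerical library, and all function names are assumptions.

```python
import math


def bessel_j1(x: float, steps: int = 2000) -> float:
    """J1(x) = (1/pi) * integral over [0, pi] of cos(t - x*sin(t)) dt."""
    h = math.pi / steps
    total = 0.5 * (math.cos(0.0) + math.cos(math.pi - x * math.sin(math.pi)))
    for i in range(1, steps):
        t = i * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi


def piston_response(ka: float, off_axis_rad: float) -> float:
    """Normalized directional response of a circular piston in an infinite baffle."""
    arg = ka * math.sin(off_axis_rad)
    if abs(arg) < 1e-8:
        return 1.0  # limit of 2*J1(x)/x as x approaches 0
    return 2.0 * bessel_j1(arg) / arg
```

For small ka the response stays close to one at all angles, while for large ka it drops toward zero away from the axis.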



The microphone output M will be the integration of the sound pressure over the microphone area. Assuming a constant microphone sensitivity m_0 over the microphone area, the microphone output M is then given by Equation (59) as follows:

(59)

where the integral is taken over the microphone area, and G is the sound pressure at location [ϑ_s, φ_s] on the surface of the sphere caused by plane wave incidence from direction [ϑ, φ], assuming plane wave incidence with unity magnitude. Simplifying Equation (59) yields Equation (60) as follows:

a^m^^TT (l-cos5o) for n = 0 



Y (272 + 1) 



(60) 

Equation (60) assumes an active microphone area from ϑ = 0, ..., ϑ_0 and φ = 0, ..., 2π. M_nm is the sensitivity to mode n,m. Fig. 22C indicates that the patch microphone has to have a significant size in order to attenuate the higher-order modes. In addition, the patch size has an upper limit, depending on the maximum order of interest. For example, for a system up to second order, a patch size of about 60° would be a good choice. All other modes would then be attenuated by at least a factor of about 2.5. Equation (60) allows the analysis of modes only with m=0. Unfortunately, if a different patch shape or different patch location is chosen, a general closed-form solution is difficult, if not impossible. Therefore, only numerical solutions are presented in the following section.

Array of Finite-Sized Sensors 

Ideally, a spherical array that works in combination with the modal beamformer of Fig. 1 should satisfy the orthogonality constraint given by Equation (61) as follows:

(61)

Unfortunately, it is difficult, if not impossible, to solve this equation analytically. An alternative approach is to use common sense to come up with a sensor layout and then check if Equation (61) is (at least substantially) satisfied.

For a discrete spherical sensor array based on the 24-element "extended icosahedron" of Table 5, one issue relates to the choice of microphone shape. Figs. 23A-D depict the basic pressure distributions of the spherical modes of third order, where the lines mark the zero crossings. For the other harmonics, the shapes look similar. These patterns suggest a rectangular shape for the patches to achieve a good match between the patches and the modes. The patches should be fairly large. A good solution is probably to cover the whole spherical surface. Another consideration is the area size of the sensors. Intuitively, it seems reasonable to have all sensors of equal size. Putting all these arguments together yields the sensor layout depicted in Fig. 24, which satisfies the orthogonality constraint of Equation (61) up to third order. Although the layout in Fig. 24 does not appear to involve sensors of equal area, this is an artifact of projecting the 3-D curved shapes onto a 2-D rectilinear graph. Although there are still significant aliasing components from the fourth-order modes, the fifth-order modes are already significantly suppressed. As such, the fourth-order modes can be seen as a transition region.

Practical Implementation of Patch Microphones

This section describes a possible physical implementation of the spherical array using patch microphones. Since these microphones have almost arbitrary shape and follow the curvature of the sphere, patch microphones are preferred over conventional large-membrane microphones. Nevertheless, conventional large-membrane microphones are a good compromise since they have very good noise performance, they are a proven technology, and they are easier to handle.

One solution might come with a material called EMFi. See J. Lekkala and M. Paajanen, "EMFi - New electret material for sensors and actuators," Proceedings of the 10th International Symposium on Electrets, Delphi (IEEE, Piscataway, NJ, 1999), pp. 743-746, the teachings of which are incorporated herein by reference. EMFi is a charged cellular polymer that shows piezo-electric properties. The reported sensitivity of this material to air-borne sound is about 0.7 mV/Pa. The polymer is provided as a foil with a thickness of 70 µm. In order to use it as a microphone, metalization is applied on both sides of the foil, and the voltage between these electrodes is picked up. Since the material is a thin polymer, it can be glued directly onto the surface of the sphere. Also, the shape of the sensor can be arbitrary. A problem might be encountered with the sensor self-noise. An equivalent noise level of about 50 dBA is reported for a sensor of size 3.1 cm^2.

Fig. 25 illustrates an integrated scheme of standard electret microphone point sensors 2502 and patch sensors 2504 designed to reduce the noise problem. At low frequencies, signals from the point sensors are used. A low sensor self-noise is especially important at lower frequencies where the beampattern tends to be superdirectional. At higher frequencies, where the noise gain is due to the array, signals from the patch sensors are used. The patch sensors can be glued on the surface of the sphere on top of the standard microphone capsules. In that case, the patches should have only a small hole 2506 at the location of the point sensor capsule to allow sound to reach the membrane of the capsules.

Both arrays (the point sensor array and the patch sensor array) can be combined using a simple first- or second-order crossover network. The crossover frequency will depend on the array dimensions. For a 24-element array with a radius of 37.5 mm, a crossover frequency of 3 kHz could be chosen if all modes up to third order are to be used. The crossover frequency is a compromise between the WNG, the aliasing, and the order of the crossover network. Concerning the WNG, the patch sensor array should be used only if there is maximum WNG from the array (e.g., at about 5 kHz). However, at this frequency, spatial aliasing already starts to occur. Therefore, significant attenuation for the point sensor array is desired at 5 kHz. If it is desirable to keep the order of the crossover low (first or second order), the crossover frequency should be about 3 kHz.
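
The trade-off described above can be made concrete with a first-order complementary crossover evaluated per frequency. The specification does not state the filter type, so the single-pole low-pass and high-pass responses used here, along with all names, are assumptions of this sketch.

```python
def first_order_crossover(freq_hz: float, crossover_hz: float):
    """Return (low-pass, high-pass) complex gains of a first-order crossover."""
    s = 1j * freq_hz / crossover_hz
    low = 1.0 / (1.0 + s)
    high = s / (1.0 + s)
    return low, high


def combine_arrays(point_value: complex, patch_value: complex,
                   freq_hz: float, crossover_hz: float = 3000.0) -> complex:
    """Blend point-sensor and patch-sensor spectra at one frequency."""
    low, high = first_order_crossover(freq_hz, crossover_hz)
    return low * point_value + high * patch_value
```

With a 3 kHz crossover, the point-sensor branch is attenuated only to about 0.51 of its amplitude (roughly 6 dB) at 5 kHz. This illustrates why a low-order crossover needs its crossover frequency well below the frequency at which aliasing sets in.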

There are other ways to implement modal low-pass filters. For example, instead of using a continuous patch microphone, a "sampled patch microphone" can be used. As represented in Fig. 26, this involves taking several microphone capsules 2602 located within an effective patch area 2604 and combining their outputs, as described in U.S. Patent No. 5,388,163, the teachings of which are incorporated herein by reference. Alternatively, a sampled patch microphone could be implemented using a number of individual electret microphones. Although this solution will also have an upper frequency limit, this limit can be designed to be outside the frequency range of interest. This solution will typically increase the number of sensors significantly. From Equation (61), in order to get twice the frequency range, four times as many microphones would be needed. However, since the signals within a sampled patch microphone are summed before being sampled, the number of channels that have to be processed remains unchanged. This would also extend the lower frequency range, since the noise performance of the sampled patches is 10log(S_p) better than the self-noise of a single sensor, where S_p is the number of sensors per patch. This additional noise gain might allow omitting the microphone correction filters that are used to compensate for the differences between the microphone capsules. This would even simplify the processing of the microphone signals.
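
The two scaling rules in this paragraph are simple enough to state as helpers. This is a minimal sketch with hypothetical names; the square-law sensor scaling is taken from the statement that doubling the frequency range needs four times as many microphones.

```python
import math


def patch_noise_improvement_db(sensors_per_patch: int) -> float:
    """Self-noise improvement of a sampled patch over a single sensor, 10*log10(S_p)."""
    return 10.0 * math.log10(sensors_per_patch)


def sensor_multiplier_for_bandwidth(bandwidth_factor: float) -> float:
    """Factor by which the sensor count grows when the upper limit is scaled."""
    return bandwidth_factor ** 2
```

For example, four capsules per patch give about 6 dB of noise improvement while the number of processed channels stays the same.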

Alternative Approaches To Overcome Spatial Aliasing 

The previous sections describe the use of patch sensors or sampled patch sensors to address the 
spatial aliasing problem. Although from a technical point of view, this is an optimal solution, it might 
cause problems in the implementation. These problems relate to either the difficulty involved in building 
the patch sensors for a continuous patch solution or the possibly large number of sensors for the sampled 
patch solution. This section describes two other approaches: (a) using nested spherical arrays and (b) 
exploiting the natural diffraction of the sphere. 

In Fig. 2, for example, one sensor array covered the whole frequency band. It is also possible to use two or more sensor arrays, e.g., staged on concentric spheres, where the outer arrays are located on soft, "virtual" spheres, elevated over the sphere located at the center, which itself could be either a hard sphere or a soft sphere. Fig. 26A gives an idea of how this array can be implemented. For simplicity, Fig. 26A shows only one sensor. The sensors of different spheres do not necessarily have to be located at the same spherical coordinates ϑ, φ. Only the innermost array can be on the surface of a sphere. The outermost array, having the largest radius, would cover the lower frequency band, while the innermost array covers the highest frequencies. The outputs of the individual arrays would be combined using a simple (e.g., passive) crossover network. Assuming the number of microphones is the same for all arrays (this does not necessarily need to be the case), the smaller the radius, the smaller the distance between microphones and the higher the upper frequency limit before spatial aliasing occurs.

A particularly efficient implementation is possible if all of the sensors of the different arrays are located at the same set of spherical coordinates. In this case, instead of using a different beamformer for each different array, a single beamformer can be used for all of the arrays, where the signals from the different arrays are combined, e.g., using a crossover network, before the signals are fed into the beamformer. As such, the overall number of input channels can be the same as for a single-array embodiment having the same number of sensors per array.

According to another approach, instead of using the entire sensor array to cover the high frequencies, fewer than all, and as few as just a single one, of the sensors in the array could be used for the high frequencies. In a single-sensor implementation, it would be preferable to use the microphone closest to the desired steering angle. This approach exploits the directivity introduced by the natural diffraction of the sphere. For a rigid sphere, this is given by Equation 6. Fig. 26B shows the resulting directivity pattern of a sensor on the surface of a sphere. The lower frequency band would be processed by the entire sensor array, while the higher frequency band would be recorded with just one or a few microphones pointing towards the desired direction. The two frequency bands can be combined by a simple crossover network.

Microphone Calibration Filters

As shown in Fig. 27, an equalization filter 2702 can be added between each microphone 102 and decomposer 104 of audio system 100 of Fig. 1 in order to compensate for microphone tolerances. Such a configuration enables beamformer 106 of Fig. 1 to be designed with a lower white noise gain. Each equalization filter 2702 has to be calibrated for the corresponding microphone 102. Conventionally, such calibration involves a measurement in an acoustically treated enclosure, which can be a cumbersome process.

Fig. 28 shows a block diagram of the calibration method for the n-th microphone equalization filter, according to one embodiment of the present invention. As indicated in Fig. 28, a noise generator 2802 generates an audio signal that is converted into an acoustic measurement signal inside a confined enclosure 2806, which also contains the n-th microphone 102 and a reference sensor 2808. The audio signal generated by the n-th microphone 102 is processed by equalization filter 2702, while the audio signal generated by reference microphone 2808 is delayed by delay element 2810 by an amount corresponding to a fraction of the equalization filter's length.

The respective resulting filtered and delayed signals are subtracted from one another at difference node 2812 to form an error signal e(t), which is fed back to adaptive control mechanism 2814. Control mechanism 2814 uses both the original audio signal from microphone 102 and the error signal e(t) to update one or more operating parameters in equalization filter 2702 in an attempt to minimize the magnitude of the error signal. Some standard adaptation algorithm, like NLMS, can be used to do this.
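
The adaptation loop can be sketched with a textbook NLMS update. This is an illustration of the general technique, not the specification's implementation: the FIR filter structure, the tap count, the delay, the step size, and all names are assumptions.

```python
from typing import List, Sequence, Tuple


def nlms_equalizer(mic: Sequence[float], reference: Sequence[float],
                   num_taps: int = 8, delay: int = 4,
                   step: float = 0.5, eps: float = 1e-8
                   ) -> Tuple[List[float], List[float]]:
    """Adapt FIR taps so the filtered microphone signal tracks the delayed reference."""
    taps = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent microphone sample first
    errors = []
    for n, sample in enumerate(mic):
        history = [sample] + history[:-1]
        filtered = sum(w * x for w, x in zip(taps, history))
        desired = reference[n - delay] if n >= delay else 0.0
        error = desired - filtered
        power = sum(x * x for x in history) + eps  # input-power normalization
        taps = [w + step * error * x / power for w, x in zip(taps, history)]
        errors.append(error)
    return taps, errors
```

If the microphone under test is simply half as sensitive as the reference, the taps converge to a gain of two at the delay position and near zero elsewhere, so the equalized signal matches the delayed reference.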

Fig. 29 shows a cross-sectional view of the calibration configuration of a calibration probe 2902 over an audio sensor 102 of a spherical microphone array, such as array 200 of Fig. 2, according to one embodiment of the present invention. For simplicity, only one array sensor, with its corresponding canal 204 for wiring (not shown), is depicted in the sphere in Fig. 29. As shown in the figure, calibration probe 2902 has a hollow rubber tube 2904 configured to feed an acoustic measurement signal into an enclosure 2906 within calibration probe 2902. Reference sensor 2808 is permanently configured at one side of enclosure 2906, which is open at its opposite side. In operation, calibration probe 2902 is placed onto microphone array 200 with the open side of enclosure 2906 facing an audio sensor 102. The calibration probe preferably has a gasket 2908 (e.g., a rubber O-ring) in order to form an airtight seal between the calibration probe and the surface of the microphone array.

In order to produce a substantially constant sound pressure field, enclosure 2906 is kept as small as practicable (e.g., 180 mm^3), where the dimensions of the volume are preferably much less than the wavelength of the maximum desired measurement frequency. To keep the errors as low as possible for higher frequencies, enclosure 2906 should be built symmetrically. As such, enclosure 2906 is preferably cylindrical in shape, where reference sensor 2808 is configured at one end of the cylinder, and the open end of probe 2902 forms the other end of the cylinder.

The size of the microphones 102 used in array 200 determines the minimum diameter of cylindrical enclosure 2906. Since a perfect frequency response is not necessarily a goal, the same microphone type can be used for both the array and the reference sensor. This will result in relatively short equalization filters, since only slight variations are expected between microphones.

In order to position calibration probe 2902 precisely above the array sensor 102, some kind of indexing can be used on the array sphere. For example, the sphere can be configured with two little holes (not shown) on opposite sides of each sensor, which align with two small pins (not shown) on the probe to ensure proper positioning of the probe during calibration processing.

Calibration probe 2902 enables the sensors of a microphone array, like array 200 of Fig. 2, to be calibrated without requiring any other special tools and/or special acoustic rooms. As such, calibration probe 2902 enables in situ calibration of each audio sensor 102 in microphone array 200, which in turn enables efficient recalibration of the sensors from time to time.




Applications 

can be implemented in different ways. 

In one implementation, modal decomposer 104 and beamformer 106 are co-located and operate together in real time. In this case, the eigenbeam outputs generated by modal decomposer 104 are provided immediately to beamformer 106 for use in generating one or more auditory scenes in real time. The control of the beamformer can be performed on-site or remotely.

In another implementation, modal decomposer 104 and beamformer 106 both operate in real time, but are implemented in different (i.e., non-co-located) nodes. In this case, data corresponding to the eigenbeam outputs generated by modal decomposer 104, which is implemented at a first node, are transmitted (via wired and/or wireless connections) from the first node to one or more other remote nodes, within each of which a beamformer 106 is implemented to process the eigenbeam outputs recovered from the received data to generate one or more auditory scenes.

In yet another implementation, modal decomposer 104 and beamformer 106 do not both operate at the same time (i.e., beamformer 106 operates subsequent to modal decomposer 104). In this case, data corresponding to the eigenbeam outputs generated by modal decomposer 104 are stored, and, at some subsequent time, the data is retrieved and used to recover the eigenbeam outputs, which are then processed by one or more beamformers 106 to generate one or more auditory scenes. Depending on the application, the beamformers may be either co-located or non-co-located with the modal decomposer.

Each of these different implementations is represented generically in Fig. 1 by channels 114, through which the eigenbeam outputs generated by modal decomposer 104 are provided to beamformer 106. The exact implementation of channels 114 will then depend on the particular application. In Fig. 1, channels 114 are represented as a set of parallel streams of eigenbeam output data (i.e., one time-varying eigenbeam output for each eigenbeam in the spherical harmonic expansion for the microphone array).

In certain applications, a single beamformer, such as beamformer 106 of Fig. 1, is used to generate one output beam. In addition or alternatively, the eigenbeam outputs generated by modal decomposer 104 may be provided (either in real time or non-real time, and either locally or remotely) to one or more additional beamformers, each of which is capable of independently generating one output beam from the set of eigenbeam outputs generated by decomposer 104.

This specification describes the theory behind a spherical microphone array that uses modal beamforming to form a desired spatial response to incoming sound waves. This approach brings many advantages over a "conventional" array. For example, (1) it provides a very good relation between maximum directivity and array dimensions; (2) it allows very accurate control over the beampattern; (3) the look direction can be steered to any angle in 3-D space; (4) a reasonable directivity can be achieved at low frequencies; and (5) the beampattern can be designed to be frequency-invariant over a wide frequency range.

This specification also proposes an implementation scheme for the beamformer, based on an orthogonal decomposition of the sound field. The computational costs of this beamformer are lower than for a comparable conventional filter-and-sum beamformer, while the scheme offers greater flexibility. An algorithm is described to compute the filter weights for the beamformer to maximize the directivity index under a robustness constraint. The robustness constraint ensures that the beamformer can be applied to a real-world system, taking into account the sensor self-noise, the sensor mismatch, and the inaccuracy in the sensor locations. Based on the presented theory, the beamformer design can be adapted to optimization schemes other than maximum directivity index.

The spherical microphone array has great potential in the accurate recording of spatial sound fields where the intended application is for multichannel or surround playback. It should be noted that current home theatre playback systems have five or six channels. Currently, there are no standardized or generally accepted microphone-recording methods that are designed for these multichannel playback systems. Microphone systems that have been described in this specification can be used for accurate surround-sound recording. The systems also have the capability of supplying, with little extra computation, many more playback channels. The inherent simplicity of the beamformer also allows for a computationally efficient algorithm for real-time applications. The multiple channels of the orthogonal modal beams enable matrix decoding of these channels in a simple way that would allow easy tailoring of the audio output for any general loudspeaker playback system, from monophonic up to in excess of sixteen channels (using up to third-order modal decomposition). Thus, the spherical microphone systems described here could be used for archival recording of spatial audio to allow for future playback systems with a larger number of loudspeakers than current surround audio systems in use today.

Although the present invention has been described primarily in the context of a microphone array comprising a plurality of audio sensors mounted on the surface of an acoustically rigid sphere, the present invention is not so limited. In reality, no physical structure is ever perfectly rigid or perfectly spherical, and the present invention should not be interpreted as having to be limited to such ideal structures. Moreover, the present invention can be implemented in the context of shapes other than spheres that support orthogonal harmonic expansion, such as "spheroidal" oblates and prolates, where, as used in this specification, the term "spheroidal" also covers spheres. In general, the present invention can be implemented for any shape that supports orthogonal harmonic expansion of order two or greater. It will also be understood that certain deviations from ideal shapes are expected and acceptable in real-world implementations. The same real-world considerations apply to satisfying the discrete orthonormality condition applied to the locations of the sensors. Although, in an ideal world, satisfaction of the condition corresponds to the mathematical delta function, in real-world implementations, certain deviations from this exact mathematical formula are expected and acceptable. Similar real-world principles also apply to the definitions of what constitutes an acoustically rigid or acoustically soft structure.

The present invention may be implemented as circuit-based processes, including possible implementation on a single integrated circuit. As would be apparent to one skilled in the art, various functions of circuit elements may also be implemented as processing steps in a software program. Such software may be employed in, for example, a digital signal processor, micro-controller, or general-purpose computer.

The present invention can be embodied in the form of methods and apparatuses for practicing those methods. The present invention can also be embodied in the form of program code embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention. The present invention can also be embodied in the form of program code, for example, whether stored in a storage medium, loaded into and/or executed by a machine, or transmitted over some transmission medium or carrier, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention. When implemented on a general-purpose processor, the program code segments combine with the processor to provide a unique device that operates analogously to specific logic circuits.

Unless explicitly stated otherwise, each numerical value and range should be interpreted as being approximate as if the word "about" or "approximately" preceded the value or range.

It will be further understood that various changes in the details, materials, and arrangements of the parts which have been described and illustrated in order to explain the nature of this invention may be made by those skilled in the art without departing from the principle and scope of the invention as expressed in the following claims. Although the steps in the following method claims, if any, are recited in a particular sequence with corresponding labeling, unless the claim recitations otherwise imply a particular sequence for implementing some or all of those steps, those steps are not necessarily intended to be limited to being implemented in that particular sequence.






CLAIMS 



What is claimed is: 

1. A method for processing audio signals, comprising:

receiving a plurality of audio signals, each audio signal having been generated by a different sensor of
a microphone array; and

decomposing the plurality of audio signals into a plurality of eigenbeam outputs, wherein each
eigenbeam output corresponds to a different eigenbeam for the microphone array and at least one of the
eigenbeams has an order of two or greater.

2. The invention of claim 1, wherein the eigenbeams correspond to spheroidal harmonics based on a
spherical, oblate, or prolate configuration of the sensors in the microphone array.

3. The invention of claim 1, wherein at least one of the eigenbeams has an order of at least three. 

4. The invention of claim 1, wherein the microphone array comprises the plurality of sensors
mounted on an acoustically rigid sphere. 

5. The invention of claim 4, wherein one or more of the sensors are pressure sensors. 

6. The invention of claim 5, wherein at least one pressure sensor comprises a patch sensor operating
as a spatial low-pass filter to avoid spatial aliasing resulting from relatively high frequency components in
the audio signals.

7. The invention of claim 6, wherein at least one patch sensor comprises a number of proximally 
configured, individual pressure sensors, wherein, for each such patch sensor, analog signals generated by 
the number of individual pressure sensors are combined before sampling to generate a digital audio signal 
for that patch sensor. 

8. The invention of claim 6, wherein the at least one pressure sensor further comprises a point sensor 
positioned below the patch sensor, wherein: 

the point sensor is used to generate relatively low frequency audio signals; and 
the patch sensor is used to generate relatively high frequency audio signals. 



9. The invention of claim 4, wherein one or more of the sensors are elevated over the surface of the
sphere.

10. The invention of claim 1, wherein the microphone array comprises the plurality of sensors
mounted on an acoustically soft sphere.

11. The invention of claim 10, wherein one or more of the sensors are cardioid sensors configured with
their nulls pointing towards the center of the sphere.

12. The invention of claim 1, wherein the number and positions of sensors in the microphone array 
enable representation of a beampattern as a series expansion involving at least second-order spheroidal 
harmonics. 

13. The invention of claim 12, wherein the number of sensors is based on the highest-order spheroidal 
harmonic in the series expansion. 

14. The invention of claim 1, wherein the arrangement of the sensors in the microphone array satisfies 
a discrete orthogonality condition. 
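
As an illustration only, a discrete orthogonality condition of this kind can be checked numerically. The sketch below assumes equal quadrature weights of 4π/S, real-valued spherical harmonics up to order two, and sensors at the 12 vertices of an icosahedron; the layout and the helper names (`icosahedron_vertices`, `real_harmonics`, `max_orthogonality_error`) are assumptions made for this example and are not taken from the claims.

```python
import math


def icosahedron_vertices():
    """Return the 12 unit-norm vertices of a regular icosahedron (assumed sensor layout)."""
    g = (1.0 + math.sqrt(5.0)) / 2.0          # golden ratio
    norm = math.sqrt(1.0 + g * g)             # length of (0, 1, g)
    points = []
    for a in (-1.0, 1.0):
        for b in (-g, g):
            # cyclic permutations of (0, +/-1, +/-g)
            points.append((0.0, a / norm, b / norm))
            points.append((a / norm, b / norm, 0.0))
            points.append((b / norm, 0.0, a / norm))
    return points


def real_harmonics(x, y, z):
    """Orthonormal real spherical harmonics of orders 0, 1 and 2 at a unit vector."""
    pi = math.pi
    c0 = math.sqrt(1.0 / (4.0 * pi))
    c1 = math.sqrt(3.0 / (4.0 * pi))
    c2a = math.sqrt(15.0 / (4.0 * pi))
    c2b = math.sqrt(5.0 / (16.0 * pi))
    c2c = math.sqrt(15.0 / (16.0 * pi))
    return [
        c0,                                   # order 0
        c1 * x, c1 * y, c1 * z,               # order 1
        c2a * x * y, c2a * y * z, c2a * x * z,
        c2b * (3.0 * z * z - 1.0),
        c2c * (x * x - y * y),                # order 2
    ]


def max_orthogonality_error(points):
    """Largest deviation of (4*pi/S) * sum_s Y_i(p_s) Y_j(p_s) from the identity matrix."""
    rows = [real_harmonics(*p) for p in points]
    count = len(rows[0])
    scale = 4.0 * math.pi / len(points)       # equal-weight quadrature (an assumption)
    worst = 0.0
    for i in range(count):
        for j in range(count):
            value = scale * sum(r[i] * r[j] for r in rows)
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(value - target))
    return worst


if __name__ == "__main__":
    print(max_orthogonality_error(icosahedron_vertices()))
```

With this layout the error is at the level of floating-point rounding, because the icosahedron vertices integrate polynomials up to degree five exactly; a six-point octahedral layout would fail the same check at order two.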

15. The invention of claim 1, wherein decomposing the plurality of audio signals further comprises 
treating each sensor signal as a directional beam for relatively high frequency components in the audio 
signals. 

16. The invention of claim 1, further comprising generating an auditory scene based on the eigenbeam 
outputs and their corresponding eigenbeams. 

17. The invention of claim 16, wherein generating the auditory scene comprises independently 
generating two or more different auditory scenes based on the eigenbeam outputs and their corresponding 
eigenbeams. 

18. The invention of claim 16, wherein generating the auditory scene comprises: 
applying a weighting value to each eigenbeam output to form a weighted eigenbeam; and 
combining the weighted eigenbeams to generate the auditory scene. 
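
The two steps recited above (weighting each eigenbeam output, then combining the weighted eigenbeams) can be sketched as a weighted sum. This is a minimal sketch that assumes real-valued, frequency-independent weights and time-domain eigenbeam outputs of equal length; the function name `combine_eigenbeams` is hypothetical.

```python
from typing import List, Sequence


def combine_eigenbeams(outputs: Sequence[Sequence[float]],
                       weights: Sequence[float]) -> List[float]:
    """Weight each eigenbeam output and sum the weighted eigenbeams sample by sample."""
    if len(outputs) != len(weights):
        raise ValueError("need exactly one weight per eigenbeam output")
    if not outputs:
        return []
    length = len(outputs[0])
    if any(len(beam) != length for beam in outputs):
        raise ValueError("all eigenbeam outputs must have the same length")
    scene = [0.0] * length
    for weight, beam in zip(weights, outputs):
        for t, sample in enumerate(beam):
            scene[t] += weight * sample       # weighted eigenbeam added to the sum
    return scene


if __name__ == "__main__":
    # Two made-up eigenbeam outputs of two samples each.
    print(combine_eigenbeams([[1.0, 2.0], [10.0, 20.0]], [0.5, 0.1]))
```

Because the combination is a plain sum over stored outputs, different weight sets can be applied to the same eigenbeam outputs to produce different scenes without repeating the decomposition.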

19. The invention of claim 1, further comprising storing data corresponding to the eigenbeam outputs 
for subsequent processing. 






20. The invention of claim 19, further comprising: 
recovering the eigenbeam outputs from the stored data; and 

generating an auditory scene based on the recovered eigenbeam outputs and their corresponding 
eigenbeams. 

21. The invention of claim 1, further comprising transmitting data corresponding to the eigenbeam 
outputs for remote receipt and processing. 

22. The invention of claim 21, further comprising: 
recovering the eigenbeam outputs from the received data; and 

generating an auditory scene based on the recovered eigenbeam outputs and their corresponding 
eigenbeams. 

23. The invention of claim 1, further comprising applying an equalizer filter to each eigenbeam output 
to compensate for frequency dependence of the corresponding eigenbeam. 

24. The invention of claim 1, wherein receiving the plurality of audio signals further comprises 
generating the plurality of audio signals using the microphone array. 

25. The invention of claim 24, wherein receiving the plurality of audio signals further comprises 
calibrating each sensor of the microphone array based on measured data generated by the sensor. 

26. The invention of claim 25, wherein receiving the plurality of audio signals comprises calibrating 
each sensor of the microphone array using a calibration module comprising a reference sensor and an 
acoustic source configured on an enclosure having an open side, wherein the open side of the volume is 
held on top of the sensor in order to calibrate the sensor relative to the reference sensor. 

27. The invention of claim 1, wherein the plurality of sensors are arranged in two or more concentric 
arrays of sensors, wherein each array is adapted for audio signals in a different frequency range. 

28. The invention of claim 27, wherein audio signals from different arrays are combined prior to being 
decomposed into a plurality of eigenbeams. 

29. The invention of claim 1, wherein all of the sensors are used to process relatively low-frequency 
signals, while only a subset of the sensors are used to process relatively high-frequency signals. 






30. The invention of claim 29, wherein only one of the sensors is used to process the relatively high- 
frequency signals. 

31. A microphone, comprising a plurality of sensors mounted in an arrangement, wherein the number 
and positions of sensors in the arrangement enable representation of a beampattern for the microphone as a 
series expansion involving at least one second-order eigenbeam. 

32. The invention of claim 31, wherein the series expansion involves an eigenbeam having an order of at 
least three. 



33. The invention of claim 31, wherein the arrangement is one of spherical, oblate, or prolate. 



35. The invention of claim 34, wherein the sensors are pressure sensors. 

36. The invention of claim 35, wherein at least one pressure sensor comprises a patch sensor operating 
as a spatial low-pass filter to avoid aliasing resulting from relatively high frequency components in the 
audio signals. 

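37. The invention of claim 36, wherein at least one patch sensor comprises a number of proximally 
configured, individual pressure sensors, wherein, for each such patch sensor, analog signals generated by 
the number of individual pressure sensors are combined before sampling to generate a digital audio signal 
for that patch sensor. 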



38. The invention of claim 36, wherein the at least one pressure sensor further comprises a point 
sensor positioned below the patch sensor, wherein: 

the point sensor is used to generate relatively low frequency audio signals; and 
the patch sensor is used to generate relatively high frequency audio signals. 

39. The invention of claim 34, wherein one or more of the sensors are elevated over the surface of the 
sphere. 






40. The invention of claim 31, wherein the plurality of sensors are mounted on an acoustically soft 
sphere. 

41. The invention of claim 40, wherein the sensors are cardioid sensors configured with their nulls 
pointing towards the center of the sphere. 

42. The invention of claim 31, wherein the second-order eigenbeam corresponds to a second-order 
spheroidal harmonic. 

43. The invention of claim 42, wherein the number of sensors is based on the highest-order spheroidal 
harmonic in the series expansion. 

44. The invention of claim 31, wherein the arrangement of the sensors satisfies a discrete orthogonality 
condition. 

45. The invention of claim 31, further comprising a processor configured to decompose a plurality of 
audio signals generated by the sensors into a plurality of eigenbeam outputs, wherein each eigenbeam 
output corresponds to a different eigenbeam for the microphone array and at least one of the eigenbeams 
has an order of two or greater. 

46. The invention of claim 45, wherein the processor is further configured to generate an auditory 
scene based on the eigenbeam outputs and their corresponding eigenbeams. 

47. The invention of claim 31, wherein the plurality of sensors are arranged in two or more concentric 
arrays of sensors, wherein each array is adapted for audio signals in a different frequency range. 

48. The invention of claim 47, wherein the sensors in the different arrays are located at the same 
spherical coordinates. 

49. The invention of claim 31, wherein all of the sensors are used to process relatively low-frequency 
signals, while only a subset of the sensors are used to process relatively high-frequency signals. 

50. The invention of claim 49, wherein only one of the sensors is used to process the relatively high- 
frequency signals. 




51. A method for generating an auditory scene, comprising: 

receiving eigenbeam outputs, the eigenbeam outputs having been generated by decomposing a plurality 
of audio signals, each audio signal having been generated by a different sensor of a microphone array, 
wherein each eigenbeam output corresponds to a different eigenbeam for the microphone array and at least 
one of the eigenbeam outputs corresponds to an eigenbeam having an order of two or greater; and 

generating the auditory scene based on the eigenbeam outputs and their corresponding eigenbeams. 

52. The invention of claim 51, wherein generating the auditory scene comprises: 
applying a weighting value to each eigenbeam output to form a weighted eigenbeam; and 
combining the weighted eigenbeams to generate the auditory scene. 

53. The invention of claim 51, wherein generating the auditory scene further comprises applying an 
equalizer filter to each eigenbeam output to compensate for frequency dependence of the corresponding 
eigenbeam. 



54. The invention of claim 51, wherein the microphone array comprises a plurality of sensors mounted 
in a spheroidal arrangement. 

55. The invention of claim 54, wherein the plurality of sensors are mounted on an acoustically rigid 
sphere. 



56. The invention of claim 55, wherein the sensors are pressure sensors. 



57. The invention of claim 56, wherein at least one pressure sensor comprises a patch sensor operating 
as a spatial low-pass filter to avoid aliasing resulting from relatively high frequency components in the 
audio signals. 

58. The invention of claim 57, wherein at least one patch sensor comprises a number of proximally 
configured, individual pressure sensors, wherein, for each such patch sensor, analog signals generated by 
the number of individual pressure sensors are combined before sampling to generate a digital audio signal 
for that patch sensor. 

59. The invention of claim 57, wherein the at least one pressure sensor further comprises a point 
sensor positioned below the patch sensor, wherein: 

the point sensor is used to generate relatively low frequency audio signals; and 



the patch sensor is used to generate relatively high frequency audio signals. 

60. The invention of claim 55, wherein one or more of the sensors are elevated over the surface of the 
sphere. 

61. The invention of claim 54, wherein the plurality of sensors are mounted on an acoustically soft 
sphere. 

62. The invention of claim 61, wherein one or more of the sensors are cardioid sensors configured with 
their nulls pointing towards the center of the sphere. 

63. The invention of claim 54, wherein the number and positions of sensors in the microphone array 
enable representation of a beampattern as a series expansion involving at least second-order spheroidal 
harmonics. 

64. The invention of claim 63, wherein the number of sensors is based on the highest-order spheroidal 
harmonic in the series expansion. 

65. The invention of claim 54, wherein the arrangement of the sensors satisfies a discrete orthogonality 
condition. 

66. The invention of claim 51, wherein generating the auditory scene further comprises treating each 
sensor signal as a directional beam for relatively high frequency components in the audio signals. 

67. The invention of claim 51, wherein receiving the eigenbeam outputs further comprises recovering 
the eigenbeam outputs from data stored during previous processing. 

68. The invention of claim 51, wherein receiving the eigenbeam outputs further comprises recovering 
the eigenbeam outputs from data received after transmission from a remote node. 

69. The invention of claim 51, wherein the number of higher-order eigenbeams used in generating the 
auditory scene is limited to maintain a minimum value of signal-to-noise ratio (SNR). 



70. The invention of claim 69, wherein the SNR is characterized using white noise gain. 
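
One way to picture limiting the eigenbeam order by white noise gain is sketched below. It uses the common definition WNG = |w^H d|^2 / (w^H w) for a weight vector w and a look-direction response d, and then picks the highest order whose WNG stays at or above a chosen floor. The function names and the per-order WNG values in the usage lines are hypothetical and are not taken from the application.

```python
import math
from typing import Optional, Sequence


def white_noise_gain(weights: Sequence[complex], steering: Sequence[complex]) -> float:
    """Return |w^H d|^2 / (w^H w), the gain against spatially white sensor noise."""
    if len(weights) != len(steering):
        raise ValueError("weights and steering vector must have the same length")
    numerator = abs(sum(w.conjugate() * d for w, d in zip(weights, steering))) ** 2
    denominator = sum(abs(w) ** 2 for w in weights)
    return numerator / denominator


def to_db(power_ratio: float) -> float:
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(power_ratio)


def highest_order_meeting_wng(wng_db_by_order: Sequence[float],
                              min_wng_db: float) -> Optional[int]:
    """Return the largest order n whose WNG (entry n, in dB) is >= min_wng_db, else None."""
    best = None
    for order, wng_db in enumerate(wng_db_by_order):
        if wng_db >= min_wng_db:
            best = order
    return best


if __name__ == "__main__":
    # Uniform weighting of four sensors with a unit look-direction response.
    print(white_noise_gain([0.25] * 4, [1.0] * 4))
    # Hypothetical WNG values (dB) at one frequency when using orders 0..3.
    print(highest_order_meeting_wng([6.0, 0.0, -12.0, -30.0], -10.0))
```

In this sketch the floor plays the role of the minimum SNR: lowering it admits higher-order eigenbeams at the cost of more amplified sensor noise.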






71. The invention of claim 51, wherein generating the auditory scene comprises independently 
generating two or more different auditory scenes based on the eigenbeam outputs and their corresponding 
eigenbeams. 

72. The invention of claim 51, wherein the plurality of sensors are arranged in two or more concentric 
patterns, each pattern having a plurality of sensors adapted to process signals in a different frequency 
range. 



73. The invention of claim 72, wherein the sensors arranged in the innermost patterns are mounted on 
the surface of an acoustically rigid sphere. 

74. The invention of claim 51, wherein all of the sensors are used to process relatively low-frequency 
signals, while only a subset of the sensors are used to process relatively high-frequency signals. 

75. The invention of claim 74, wherein only one of the sensors is used to process the relatively high- 
frequency signals. 






Fig. 1 (block diagram; labels: sensors s=1, s=2, ..., s=S; Decomposer (Eigen-Beamformer); eigenbeam outputs of order n=0, n=1, n=2 with modes m=0, m=1(Re), m=1(Im), m=2(Re), m=2(Im); Steering Unit; Compensation Unit; Summation Unit; Auditory Scene; reference numerals 100, 102, 104, 106, 108, 110, 112) 







Fig. 2 







Fig. 4 (legend: mode 0, mode 1, mode 2, mode 3) 




Fig. 5 (legend: mode 0, mode 1, mode 2, mode 3) 








Fig. 7 







(B) 
Fig. 8 






Fig. 8 







Fig. 9 (table with three columns; the first two column headers appear to read "Azimuth" and "Elevation"; the third header and the leading rows are not legible in the source, so only the legible rows are listed) 

Azimuth   Elevation   (third column) 
   36        79.2        0.99 
   72       100.8        0.99 
  108        79.2        0.99 
  144       100.8        0.99 
  180        79.2        0.99 
  216       100.8        0.99 
  252        79.2        0.99 
  -72       100.8        0.99 
  -36        79.2        0.99 
    0       100.8        0.99 






Fig. 10 (plot labels: f = 4 kHz; dB) 









Fig. 12 (horizontal axis: Frequency [Hz], ticks from 50 to 5000) 









Fig. 14 (horizontal axis: Frequency [Hz]) 








Fig. 15 (legend: WNG > 0 dB, WNG > -5 dB, WNG > -10 dB; horizontal axis: Frequency [Hz]) 







Fig. 16 












Fig. 20 








Fig. 21 









Fig. 22 






Fig. 23 (panels (A), (B), (C)) 






Fig. 24 (horizontal axis: φ [°], ticks 60 to 360; vertical axis ticks: 60, 90, 120) 







Fig. 25 (reference numerals 2502, 2504, 2506; angle labels 90° and 120°) 





Fig. 26 (reference numerals 2602, 2604) 








Fig. 26A 








Fig. 26B 







Fig. 27 (labels: sensors s=1, s=2, ..., s=S; V1(t), ..., VS(t); Decomposer; y(t); reference numerals 102, 104, 2702) 








Fig. 28 (labels: Vn(t); Adaptive Control mechanism; e(t); reference numerals 102, 2702, 2802, 2804, 2806, 2808, 2810, 2812, 2814) 






Fig. 29 (reference numerals 2902, 2904) 







INTERNATIONAL SEARCH REPORT 

PCT/US 03/00741 

IPC 7 H04R1/40 H04R3/00 //H04R5/027 



According to International Patent Classification (IPC) or to both national classification and IPC 

B. FIELDS SEARCHED 

Minimum documentation searched (classification system followed by classification symbols) 

IPC 7 H04R 

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 

Electronic data base consulted during the international search (name of data base and, where practical, search terms used) 

EPO-Internal, WPI Data 

C. DOCUMENTS CONSIDERED TO BE RELEVANT 

Category * Citation of document, with indication, where appropriate, of the relevant passages 

Relevant to claim No. 



EP 0 869 697 A (LUCENT TECHNOLOGIES INC) 
7 October 1998 (1998-10-07) 

page 3, line 44 - page 15, line 38 
abstract 

1-6, 9-12, 14, 15, 19, 21, 23-25, 27-36, 39-42, 44, 47-50 

16-18, 20, 22, 45, 46, 51-57, 60-63, 65-75 

7, 8, 13, 26, 37, 38, 43, 



[X] Further documents are listed in the continuation of box C. 

[X] Patent family members are listed in annex. 

* Special categories of cited documents: 

"A" document defining the general state of the art which is not 
considered to be of particular relevance 

"E" earlier document but published on or after the international 
filing date 

"L" document which may throw doubts on priority claim(s) or 
which is cited to establish the publication date of another 
citation or other special reason (as specified) 

"O" document referring to an oral disclosure, use, exhibition or 
other means 

"P" document published prior to the international filing date but 
later than the priority date claimed 

"T" later document published after the international filing date 
or priority date and not in conflict with the application but 
cited to understand the principle or theory underlying the 
invention 

"X" document of particular relevance; the claimed invention 
cannot be considered novel or cannot be considered to 
involve an inventive step when the document is taken alone 

"Y" document of particular relevance; the claimed invention 
cannot be considered to involve an inventive step when the 
document is combined with one or more other such documents, 
such combination being obvious to a person skilled 
in the art 

"&" document member of the same patent family 



Date of the actual completion of the international search 

5 May 2003 

Date of mailing of the international search report 



03. OR 2003 



Name and mailing address of the ISA 

European Patent Office, P.B. 5818 Patentlaan 2 
NL - 2280 HV Rijswijk 
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl, 
Fax: (+31-70) 340-3016 

Authorized officer 

HEMRIK ANDERSSON/JA A 

Form PCT/ISA/210 (second sheet) (July 1992) 



page 1 of 2 






C.(Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT 

Category * Citation of document, with indication, where appropriate, of the relevant passages 

Relevant to claim No. 

US 6 239 348 B1 (METCALF RANDALL B) 
29 May 2001 (2001-05-29) 

column 1, line 62 - column 3, line 31 

US 6 317 501 B1 (MATSUO NAOSHI) 
13 November 2001 (2001-11-13) 
the whole document 

EP 0 381 498 A (MATSUSHITA ELECTRIC IND CO 
LTD) 8 August 1990 (1990-08-08) 
the whole document 

58, 59, 64 

16-18, 20, 22, 45, 46, 51-57, 60-63, 65-75 

1-75 

1-75 



Form PCT/ISA/210 (continuation of second sheet) (July 1992) 

page 2 of 2 






Patent document           Publication    Patent family              Publication 
cited in search report    date           member(s)                  date 

EP 0869697          A     [illegible]    [entries illegible in the source] 

US 6239348          B1    29-05-2001     AU 5916199 A               03-04-2000 
                                         AU 7130200 A               10-04-2001 
                                         [illegible]                [illegible] 
                                         [illegible] A1             [illegible] 
                                         WO [illegible] A1          15-03-2001 
                                         WO 0016306 A1              23-03-2000 
                                         US [illegible] A1          13-02-2003 
                                         US [illegible]             14-03-2002 

US 6317501          B1    13-11-2001     JP 11018194 A              22-01-1999 
                                         US 2002041693 A1           11-04-2002 
                                         US [illegible] A1          27-06-2002 
                                         [illegible]                [illegible] 

EP 0381498          A     08-08-1990     JP 1996369 C               08-12-1995 
                                         JP 2205200 A               15-08-1990 
                                         JP 7028470 B               29-03-1995 
                                         EP 0381498 A2              08-08-1990 
                                         KR 9301076 B1              15-02-1993 
                                         US 5058170 A               15-10-1991 

Form PCT/ISA/210 (patent family annex) (July 1992) 







